u/WinChurchill

Data 145 Thoughts

Now that the semester is over and grades are jover, I think I can post about this new class. I really think it sucks when a new class comes up and everyone is left wondering about it on Reddit. So you're welcome, I guess? Obviously this is a biased take, but you know, what can you do about it : D

What is this class?

It is an introduction to various statistical methods (vague lmao), originally developed from a 2-unit connector course taught by the one and only Prof. Ani Adhikari, as well as Prof. Will Fithian (who teaches Stat 210A, the intro PhD stat class).

Topic-wise, it is most similar in coverage to Stat 135: Concepts in Statistics, a core requirement for the statistics major. It seems like 145 is a bit more of a superset in comparison, just like how Data 140 is mostly a superset of Stat 134 (after all, both Data classes were designed in large part by Prof. Adhikari). Notably, Data 145 has Data 140 as a prerequisite, so it assumes knowledge of regression proofs and MCMC stuff, and it covers Bayesian approaches and things like model misspecification more extensively. Stat 135, on the other hand, gives a more coordinated treatment of the various two-sample tests, from my impression of that class.

Perhaps it is better to use the exact wording from Prof. Adhikari herself:

>As Jingyuan has said, 145 has more prereqs than 135: Data 140 or EECS 126, Data 100, and multivariable calculus or a course that makes significant use of it. My experience teaching both classes is that 145 is pitched at a higher level of abstraction than 135, and its topics are somewhat more focused on current applications.
>When I teach it next year, I will probably cull a bit more of the classical stuff and replace it with more content suited for ML/AI, though you should keep in mind that the current semester's CS 189 has spent quite some time on classical theoretical stat including likelihood ratios.

Note: "current semester" refers to the SP26 Listgarten iteration of CS189, which will not be reflective of later semesters' content due to the new CS policy.

Also,

>I've taught Stat 135 multiple times, and the Data 188 inference seminar twice. My sense is that most students would not find Stat 135 to be harder than 145. As I've said in one of the other related threads, students should think of the level of 145 as somewhere between Stat 135 and Stat 210A.

As far as overlap with other classes goes, here's another account from Prof. Adhikari:

>This is quite different from my opinion. We have made a careful analysis of content and have no reason to reproduce an existing course. There are about 8 lectures' worth of material in common with Stat 135, and about 3 in common with Data 102, and even those will have differences in approach.

Who am I?

Obviously I'm not gonna say who I am on Reddit, but to give you a bit of background, I did not do remarkably well in 140; however, I did study it more carefully afterwards, fulfilled all the other prereqs, and took 102 beforehand. People who know me probably know who I am, though 💀

Difficulty?

This is one of the more intellectually challenging courses I have taken so far at Berkeley. I'd say this course took me more time to digest than CS189 at the beginning of the semester; toward the middle and end of the semester, I feel like I spent more time on 189 than 145, due to the higher abstraction of concepts as well as a lot, lot more math.

For future reference, it is likely that the newer iterations of CS189 (as per the announcement from EECS101) will be easier and less mathy than this; the specific SP26 Listgarten iteration was on par or more difficult in terms of the statistical and math material, with the main overlaps being MLE proofs, Bayesian stuff, and KL divergence (though obviously 189 covered it to a lesser extent and focused on its application in cross-entropy/loss).

The cohort was already very self-selecting at the beginning of the semester (it required an application for background vetting). By the time we reached lecture 6-7 (OoOOOoO), about 1/3 of the class had dropped (from approx. 80 to 50ish).

Practice packets were quite challenging, and quite difficult to get through without help. Help was, however, very much available, as the discussion sections worked directly through the practice packets. Both GSIs this rotation had taken Stat 210A, and they are quite goated.

The probability and math prerequisites are very much required and assumed. You will struggle if you don't have a good grasp of the Data 140/EECS 126 material in its entirety. I heard that Stat 134 didn't count as sufficient background, though on that I am not so sure. I feel like you can do it without the extra 6-7 chapters covered by 140.

The exam was, however, not very difficult, and the grade bins were very generous (thank you prof omg i thought i was gonna die after that final cuz i didnt study that much as i folded mentally by the time of my fourth final).

I really did enjoy the class, though my workload this semester (whoops, this + 2 CS classes, my bad chat) and, honestly, my procrastinating self (whoops) prevented me from studying well for the last third of the course. I would still recommend the class.

Lecture Style

Prof. Adhikari is one of the best lecturers on campus and probably has a cult following by now. On top of clear explanations of the material and excellent lecturing practices, she also tells fun stories from statistics.

>It was noted in one instance that David Blackwell (known for promoting Bayesian statistics and, probably, the Rao-Blackwell theorem) posed a question beginning with, "Suppose a putty (pin) has a two-thirds probability of landing on the flat side" before he was interrupted by a young Adhikari, who was then studying under a frequentist advisor at Berkeley, questioning what "suppose" meant. After failing to move the class forward due to her persistent questioning of the word choice, Blackwell allegedly shouted to Adhikari something along the lines of "Your advisor and their (frequentist) idea will be trashed out from the field of statistics." The lecture resumed afterwards.

Note: Frequentist ideas were not trashed out and still remain largely mainstream today, though Bayesian approaches and de facto Bayesian tooling are becoming more available thanks to more accessible computing power, libraries such as PyMC, and data streams that are best captured by a prior-likelihood representation.

There are many more secret connections to big names you have probably heard of, the kind that would activate a neuron in any statistics student (Blackwell? Rao? Neyman?). The discovery of new tales is left as an exercise for the reader.

Prof. Fithian is clearly smart; however (you can probably tell what I'm gonna say), his lecture style is a bit less clear compared to Prof. Adhikari's due to the sometimes more 'methodical' wording. He is not as bad as some CS/Stat/Econ department lecturers that you have probably experienced, though. In fact, I think he is quite above average, but the 9am climb to the physics building clearly didn't help. I prob will die if i take 210A tho.

My biggest gripe about Prof. Adhikari is her insistence on not making recordings available. I understand that she wants to promote attendance, and I, too, understand that listening to lectures in person is simply different from watching them at 2x speed online; however, that leaves no option for people who get sick (remember this spring's flu/covid season? that was absolutely insane).

Overall

Bottom line is, you should probably treat this as a 'relatively heavy' tech. Something between Stat 135 and 210A. Something that is relatively mathy and covers a lot of concepts quickly. Something that you will need to go to lectures for. I recommend this class to anyone who has fulfilled all of its prerequisites.

For further info:

https://edstem.org/us/courses/22867/discussion/7188336

https://data145.org/

u/WinChurchill — 1 day ago