r/econometrics

I built a blog where I explain data, economics, and statistical modeling through original articles

Hi everyone,

Over the past few months I've been writing long-form articles that combine economics, statistics, and data analysis in a way that's meant to be accessible without sacrificing technical depth.

The blog is completely free, has no ads, and doesn't require registration.

Some of the topics I've covered include:

- Monte Carlo simulation for real estate investment.

- Time-series modeling with VAR/VECM.

- Economic intuition behind statistical models.

- Data-driven explorations of real-world questions.

My goal isn't to publish academic papers, but to explain quantitative ideas clearly and encourage discussion.

If this sounds interesting, I'd really appreciate your feedback—both on the content and on how the articles could be improved.

https://www.inquiry-journal.com

Thanks for taking a look!

u/Ambitious_Maize9809 — 3 days ago

▲ 7 r/econometrics+1 crossposts

Economics or Econometrics Major?

I’m a commerce/finance double degree student thinking about what commerce major to pick to pair with finance. I think I enjoy economics, particularly macro, but the value of econometrics keeps coming up.

If my desired career is in finance, and my interests (while broad) are investing, banking, and possibly trading, which would be the better choice?

I’m aware of the maths difficulty involved in econometrics, and it does worry me a bit: while I’m good numerically and statistically, I have trouble with more complex maths like calculus and probability. I don’t want to rule it out, though, because I can pass with some effort.

Thanks for any advice

u/SnooFoxes2013 — 3 days ago

▲ 19 r/econometrics+2 crossposts

2026 Econometrics Summer School Cambridge - With Jeffrey Wooldridge and Melvyn Weeks | INOMICS

Free scholarship available for the 2026 Econometrics Summer School Cambridge

Timberlake Consultants is offering a free scholarship place for the 2026 Econometrics Summer School Cambridge, taking place at the University of Cambridge with Professor Jeffrey Wooldridge and Dr Melvyn Weeks.

The Summer School is designed for PhD students, researchers, applied economists, and professionals who want to strengthen their empirical skills in causal inference, Difference-in-Differences, machine learning, and applied econometrics using Stata.

The programme includes:

Course 1: An Introduction to Causal Inference and Difference-in-Differences using Stata
Taught by Professor Jeffrey Wooldridge

Course 2: Causal Inference and Machine Learning using Stata
Taught by Dr Melvyn Weeks, University of Cambridge

Participants can attend one course or the full five-day Summer School.

The scholarship covers the full tuition/training fee. Please note that accommodation and travel costs are not included, although accommodation is still available at Cambridge.

More information about the course is available here:
https://inomics.com/course/2026-econometrics-summer-school-cambridge-with-jeffrey-wooldridge-and-melvin-weeks-1554362

To apply for the scholarship, please send your academic transcript and a short motivation letter to:
edu@timberlake.co.uk

Feel free to get in touch if you have any questions. We are also offering a last-minute 30% discount to students and academics!

inomics.com

u/Francisca_Carvalho — 3 days ago

▲ 17 r/econometrics+5 crossposts

u/No_Nerve_1329 — 5 days ago

▲ 10 r/econometrics

Econometric model selection advice needed

I'm working on my bachelor thesis and would appreciate advice from someone who knows the topic.

I have ordinary cross-sectional data. My dependent variable is an index between 0 and 100 measuring university prestige, so I think I need some kind of truncated model to fit that. If I understand it correctly, the main limiting factor in these models is the function I should use, right?

Is there a standard model for these cases? Which assumptions need to be fulfilled?

u/RedditorVirgin — 7 days ago

▲ 36 r/econometrics+1 crossposts

onet2r: archived O*NET releases, OEWS/PUMS weights, and reproducible occupation measures in R

I maintain onet2r, an R package for working with O*NET data, and just shipped a larger update (the 0.4 line).

The problem it tries to handle: O*NET is useful but is not a clean longitudinal panel. The Web Services API serves the current release. Historical comparisons need the archived database files, O*NET-SOC taxonomy bridges, and some care about whether a changed value is a real update, a stale carryforward, a transition row, a suppressed estimate, or a taxonomy seam.

What it does:
- reads archived O*NET releases into one normalized panel
- keeps native 8-digit O*NET-SOC codes and derives 6-digit SOC codes for joins
- reconciles two releases and flags rows that are not safely comparable
- validates a user-supplied task or occupation measure against a pinned release
- rolls task scores up to occupations using O*NET task ratings
- builds OEWS or PUMS employment weight panels with coverage and provenance
- decomposes aggregate change into within, between, interaction, and unclassified
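
For readers who have not seen this kind of decomposition, here is a rough sketch of the within / between / interaction split for a weighted aggregate. This is plain Python with NumPy, not onet2r's R interface: the function name `decompose_change` and the toy weights and scores are made up for illustration, and the package's "unclassified" bucket is not modeled here.

```python
import numpy as np

def decompose_change(w0, x0, w1, x1):
    """Split the change in sum(w * x) between two periods into three parts.

    w0, w1: employment weights per occupation in period 0 and 1 (hypothetical)
    x0, x1: occupation-level measure in period 0 and 1 (hypothetical)
    """
    w0, x0, w1, x1 = (np.asarray(a, dtype=float) for a in (w0, x0, w1, x1))
    dw, dx = w1 - w0, x1 - x0
    return {
        "within": float(np.sum(w0 * dx)),       # measure changes, weights held fixed
        "between": float(np.sum(dw * x0)),      # weights shift, measure held fixed
        "interaction": float(np.sum(dw * dx)),  # both change together
        "total": float(np.sum(w1 * x1) - np.sum(w0 * x0)),
    }

# Toy two-occupation example
parts = decompose_change(w0=[0.5, 0.5], x0=[1.0, 3.0],
                         w1=[0.25, 0.75], x1=[2.0, 3.0])
print(parts)
```

The three components sum to the total change by construction, which is a handy sanity check on any real panel.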

Docs with runnable examples: https://farach.github.io/onet2r/
GitHub: https://github.com/farach/onet2r
Blog: https://workforcefutures.net/blog/onet2r-release/

The hardest design question, and where I would value opinions: how much comparability checking should be automatic versus left visible for the analyst? Right now I lean toward visible, with explicit flags, because a clean-looking number hides too much. Curious whether people who use O*NET historically agree.

u/farach — 8 days ago

▲ 133 r/econometrics+5 crossposts

UK wages only just got back to 2008 levels. If pre-2008 growth had continued, you’d be earning £10,700 more per year right now.

I made a video digging into UK wage stagnation and the numbers are honestly worse than I expected, so wanted to share the key findings here.

The headline stat, from the Resolution Foundation and LSE: if UK wages had kept growing at the rate they were before the 2008 financial crisis, the average worker would be earning £10,700 more per year than they actually are today. Not £10,700 more than 2008. £10,700 more than what 2008 would have grown into. Median UK wages only returned to 2008 levels (in real terms) in 2025. Seventeen years of basically zero net gain, while most other rich countries saw real wages grow 8–10% over the same period.

A few other things that stood out researching this:

Productivity collapsed and nobody really fixed it. UK productivity growth dropped from ~2%/year pre-2008 to under 1% after, and actually went backwards in 2024. Root cause: chronic underinvestment. UK manufacturing capital intensity (machinery/tech per worker) is 47% below peers like Germany, France, the US. Business investment is second-lowest in the G7.

Fiscal drag is quietly taxing pay rises into nothing. Income tax thresholds have been frozen since 2021 and are now confirmed frozen until 2031. As wages rise even slightly, more people get pulled into higher tax bands without any actual tax rise being voted on. Result: 5.76 million people now pay the 40% rate, up from 3.83 million in 2019 — a 50% jump in five years, and most of them aren't what you'd call "high earners."

Housing is eating whatever's left. English private renters spent 36.3% of income on rent in 2024 (ONS considers 30%+ unaffordable, and we've been above that every year since 2016). In London it's 46%. 16-24 year olds renting privately are at 46% nationally too.

Inequality makes it worse than the average suggests. Low-income UK households are 22% poorer than their equivalents in France. The gap at the bottom is more than double the gap at the top — the people losing out from stagnant wages aren't the ones who'd be cushioned by a "typical household" stat.

Full breakdown with all the sources (ONS, OBR, IPPR, Resolution Foundation/LSE, HMRC) is in the video if anyone wants to go deeper:

https://youtu.be/mWprlul6vDc?is=4AUM-RoGGb9JDBa2

Thank you and have a great weekend

u/theionarr — 14 days ago

▲ 72 r/econometrics+12 crossposts

The sample mean as a projection onto the span of the ones vector

I’ve been thinking about the sample mean from a linear algebra perspective.

If y is a data vector and 1 is the vector of all ones, then the average can be seen as the scalar you get when projecting y onto span(1).

So the projection has the form:

y-hat = y-bar · 1

where y-bar is the usual sample average.

I like this because it makes the average feel like the simplest possible least-squares problem: find the constant vector closest to the data vector.

It also connects naturally to ordinary least squares regression, where y gets projected onto the column space of X instead of just the one-dimensional space spanned by 1.
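
A quick numerical check of the idea, as a minimal sketch with NumPy. The data vector is a made-up three-point example; the projection matrix onto span(1) is built explicitly and compared with an intercept-only least-squares fit.

```python
import numpy as np

y = np.array([2.0, 4.0, 9.0])   # toy data vector
ones = np.ones_like(y)

# Projection matrix onto span(1): P = 1 (1'1)^{-1} 1'
P = np.outer(ones, ones) / (ones @ ones)
y_hat = P @ y                   # every entry equals the sample mean
residual = y - y_hat            # orthogonal to the ones vector

# The same number falls out of least squares with only an intercept column
coef, *_ = np.linalg.lstsq(ones.reshape(-1, 1), y, rcond=None)

print(y_hat, y.mean(), coef[0], residual @ ones)
```

Swapping the single column of ones for a full design matrix X turns this into ordinary least squares, with the same orthogonality between residuals and every column of X.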

Does this seem like a good way to introduce projections/least squares, or would you teach it differently?

youtu.be

u/CubionAcademy — 12 days ago

▲ 13 r/econometrics+2 crossposts

Hayashi – an open-source DSL for applied econometrics (with a 307-page book, free)

I've been building Hayashi, a domain-specific language for applied econometrics, for the past year. It's a solo project, open-source (GPL v3), written in Rust.

The goal is a language where econometric intent reads directly from the code — no boilerplate, no package hunting, no version conflicts. Something between Stata's expressiveness and a proper programming language.

What it covers so far:

OLS, IV/2SLS, panel data (FE/RE/FD), DiD, RDD, GMM, quantile regression, logit/probit, Poisson/NB, Tobit, survival analysis (Cox/KM), PSM, synthetic control, ARMA/ARIMA, VAR, cointegration/VECM, GARCH, ARDL/ECM, Kalman filter, SUR, Lasso/Ridge, bootstrap, PCA, structural breaks, regime switching, and more.

Book: a 307-page reference manual is available in both English and Portuguese (PT-BR), covering the full language and all estimators with worked examples.

GitHub: https://github.com/sheep-farm/hayashi

Still early — v0.2.4. I'm especially looking for people willing to test it on real datasets and report what breaks. The book has a whole section on this, but real-data testers are the biggest gap right now.

Happy to answer questions about design decisions, what's planned, or why I built this instead of just writing an R package.

One practical note: Hayashi ships as a single binary — no installation wizard, no external libraries, no dependency management. Just download and run. Works on Windows, macOS, and Linux. The only exception is ODBC connectivity, which requires the system ODBC driver if you need it.

u/UnlikelyFuel5610 — 9 days ago

▲ 36 r/econometrics+1 crossposts

What is a Time Series [EDUCATIONAL GUIDE]

Learn what a time series is, how quantitative analysts use them in financial markets, and common techniques that are applied in practice to equity research.

A time series graph shows how a series of data evolves over time. Understanding the mathematical structure of that data is the foundation of this field of study. Stock price data, year-over-year changes in net income, interest rate fluctuations, and much more can be plotted as time series. Model development, including trend analysis, volatility modeling, and forecasting, relies on the foundation of time series analysis.

This guide introduces the core concepts of time series analysis and explains how these techniques are often applied to financial markets.

What is a Time Series?

A time series is a sequence of data points collected at successive points in time, usually at uniform intervals. In financial markets, time series are everywhere: daily closing prices, quarterly earnings values, intraday bid-ask spreads, and rolling volatility estimates are all examples. The defining characteristic is that order matters. Time runs from point 1 to 2 to 3, and the values must follow that order; the series cannot jump from time 1 to 3 and then back to 2. Unlike a cross-sectional dataset (e.g. each available stock's price at a single moment), each value in a time series is tied to a specific moment, and there is only one Y value for each time value X. We use a time series to measure changes over time for one option, and we use a cross-sectional analysis to compare options at a single instant in time. Time series analysis tries to explain why the data points change over time.

Historically, stock prices have been seen to behave with a sort of Brownian motion (i.e. randomness), so a time-series analysis of stock prices and "trying to predict future prices" can be seen as futile. We agree. However, Systems Capital would like to provide some insight into this discussion. One certainty we find in investing is that if we invest for the long term with good people doing good work, our investment appreciates. We find it incredibly valuable (and perhaps even mandatory) to track our managers' performance over time and to play our role as owners as best we can. Performing a time-series analysis on year-over-year net income, the company's historic solvency ratios or credit scores, or even management's progress against its publicly stated goals helps us see how they have changed over time. This is something we do for all our investments, and we find this application of time-series analysis to be largely beneficial for our investment portfolio. In addition, this sort of analysis does allow us to better predict future prices, albeit over a longer horizon in practice. By defining characteristics that are reasonable given the company's historic performance, we can set estimated future price targets (e.g. if all else stays the same, but we expect the company to lower its cost of revenue by 50% by the end of the year, we can form a better estimate of what the stock price will be by then).

This article will go over some major concepts that firms consider when modelling a time series.

Stationarity

Before applying most forecasting or modeling techniques, a time series must satisfy a property known as stationarity. A stationary series has a constant mean, constant variance, and an autocorrelation structure that does not change over time. Intuitively, a stationary series oscillates around a fixed level rather than trending steadily upward or downward.

Raw stock price levels are almost never stationary: prices drift over time, so their mean and variance change continuously. Daily log returns, however, are typically much closer to stationary. Transforming a price level into a return series by taking first differences of log prices is therefore a standard pre-processing step before building any model. The Augmented Dickey-Fuller (ADF) test is the most widely used statistical tool for checking stationarity: its null hypothesis is that the series has a unit root, so rejecting the null is evidence in favor of stationarity. Failing to ensure stationarity before modeling often leads to spurious results, meaning apparent relationships that dissolve when examined properly. Working in return space rather than price space is one of the most important habits in quantitative finance.
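
To make the price-versus-return contrast concrete, here is a minimal sketch using only NumPy. It implements the simple Dickey-Fuller regression (no lagged differences, so it is not the full augmented test) on simulated data. The function name `dickey_fuller_stat`, the simulation parameters, and the use of the approximate asymptotic 5% critical value of -2.86 for the constant-only case are all assumptions made for illustration; for real work a tested library implementation is the better choice.

```python
import numpy as np

def dickey_fuller_stat(y):
    """t-statistic on rho in: diff(y)_t = a + rho * y_{t-1} + e_t.

    Simple Dickey-Fuller regression with a constant and no lagged differences.
    """
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    sigma2 = resid @ resid / (len(dy) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

# Simulated prices (a random walk in logs), not real market data
rng = np.random.default_rng(0)
shocks = rng.normal(0.0003, 0.01, size=1000)
prices = np.exp(np.log(100.0) + np.cumsum(shocks))

# Log returns are first differences of log prices
log_returns = np.diff(np.log(prices))

CRIT_5PCT = -2.86  # approximate asymptotic 5% critical value, constant-only case
stat_prices = dickey_fuller_stat(np.log(prices))
stat_returns = dickey_fuller_stat(log_returns)

print(f"log prices:  {stat_prices:.2f} (reject unit root: {stat_prices < CRIT_5PCT})")
print(f"log returns: {stat_returns:.2f} (reject unit root: {stat_returns < CRIT_5PCT})")
```

A statistic far below the critical value, as for the returns, rejects the unit-root null; the price series usually does not get close.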

Autocorrelation

Autocorrelation measures the degree to which a time series is correlated with its own past values. High positive autocorrelation at lag 1 means that if yesterday's return was positive, today's is more likely to be positive as well. Near-zero autocorrelation suggests returns are essentially random from one period to the next, consistent with the behavior expected in a highly efficient market.

The Autocorrelation Function (ACF) plot visualizes this by computing the correlation between the series and lagged copies of itself at each lag. By definition, lag 0 always produces a correlation of 1.0, because a series is perfectly correlated with itself. If the ACF decays gradually, the series has persistent momentum. If it drops immediately to near zero, consecutive returns are largely independent. In practice, daily equity returns often show near-zero autocorrelation in raw returns but meaningful autocorrelation in squared returns: a large move today predicts a large move tomorrow, even if the direction is unknown. This volatility clustering is what frameworks like GARCH (Generalized Autoregressive Conditional Heteroskedasticity) are built to model.
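
The raw-versus-squared contrast is easy to reproduce. The sketch below computes a sample ACF by hand with NumPy and applies it to returns simulated from a GARCH(1,1) process. The function name `acf`, the seed, and the parameter values (omega, alpha, beta) are illustrative assumptions, not estimates from any real security.

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    denom = d @ d
    return np.array([1.0 if k == 0 else (d[k:] @ d[:-k]) / denom
                     for k in range(max_lag + 1)])

# Simulate GARCH(1,1) returns: var_t = omega + alpha * r_{t-1}^2 + beta * var_{t-1}
rng = np.random.default_rng(1)
n, omega, alpha, beta = 5000, 0.1, 0.2, 0.7
r = np.zeros(n)
var = omega / (1 - alpha - beta)      # start at the unconditional variance
for t in range(n):
    r[t] = np.sqrt(var) * rng.standard_normal()
    var = omega + alpha * r[t] ** 2 + beta * var

acf_raw = acf(r, 5)        # direction of returns: expected to be near zero
acf_sq = acf(r ** 2, 5)    # size of returns: expected to be clearly positive

print("raw returns    :", np.round(acf_raw, 3))
print("squared returns:", np.round(acf_sq, 3))
```

The returns themselves carry almost no linear memory, while their squares do, which is the signature of volatility clustering.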

Moving Averages

Moving averages are among the most popular applied time series tools in finance. They are largely used in day-trading. A Simple Moving Average (SMA) smooths a noisy series by computing the arithmetic mean of a rolling window of n observations. An Exponential Moving Average (EMA) assigns decreasing weights to older observations, making it more responsive to recent price changes without entirely discarding historical context.
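
Both averages take only a few lines with NumPy. This is a minimal sketch: the function names are hypothetical, the EMA uses the common smoothing convention alpha = 2 / (n + 1) and is seeded with the first observation, and other conventions exist.

```python
import numpy as np

def sma(x, n):
    """Simple moving average over a rolling window of n observations."""
    x = np.asarray(x, dtype=float)
    return np.convolve(x, np.ones(n) / n, mode="valid")

def ema(x, n):
    """Exponential moving average with alpha = 2 / (n + 1) (assumed convention)."""
    x = np.asarray(x, dtype=float)
    alpha = 2.0 / (n + 1)
    out = np.empty_like(x)
    out[0] = x[0]                      # seed with the first observation
    for t in range(1, len(x)):
        out[t] = alpha * x[t] + (1 - alpha) * out[t - 1]
    return out

prices = [1.0, 2.0, 3.0, 4.0, 5.0]     # toy series
sma_3 = sma(prices, 3)                 # values 2, 3, 4 (one per full window)
ema_3 = ema(prices, 3)                 # values 1, 1.5, 2.25, 3.125, 4.0625
print(sma_3, ema_3)
```

On a steadily rising series the EMA sits closer to the latest price than an SMA of the same span, which is the responsiveness described above.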

How moving averages over different windows intersect (for example, a 20-day MA and a 50-day MA) is a popular signal. When a short-period MA crosses above a long-period MA, it may indicate the beginning of an upward trend. The reverse signals a potential downtrend. The appeal of moving averages is their simplicity: they require no assumptions about the distribution of returns, they adapt naturally to different time frequencies, and they can be combined in straightforward ways. Their main limitation is that they are inherently lagging indicators, responding to price changes only after they occur. This lag can reduce performance in choppy markets. Moving averages are most often applied to price charts to identify patterns in intraday performance.
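
A crossover detector can be sketched as follows. The helper names `trailing_sma` and `crossovers`, the tiny window lengths, and the toy price path are assumptions chosen so the result can be checked by hand; this is an illustration of the signal, not a trading rule.

```python
import numpy as np

def trailing_sma(x, n):
    """SMA aligned to the last observation of each window; NaN until n points exist."""
    x = np.asarray(x, dtype=float)
    out = np.full(len(x), np.nan)
    c = np.cumsum(np.insert(x, 0, 0.0))
    out[n - 1:] = (c[n:] - c[:-n]) / n
    return out

def crossovers(prices, short, long):
    """Return (up, down): indices where the short SMA crosses above / below the long SMA."""
    diff = trailing_sma(prices, short) - trailing_sma(prices, long)
    up, down = [], []
    for t in range(1, len(diff)):
        prev, cur = diff[t - 1], diff[t]
        if np.isnan(prev) or np.isnan(cur):
            continue                    # not enough history yet
        if prev <= 0 and cur > 0:
            up.append(t)
        elif prev >= 0 and cur < 0:
            down.append(t)
    return up, down

# Toy path: falls for four steps, then rises
prices = [10, 9, 8, 7, 6, 7, 8, 9, 10, 11]
up, down = crossovers(prices, short=2, long=4)
print(up, down)
```

The low is at index 4, but the upward cross only registers at index 6, which shows the lag described above.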

ARIMA Forecasting

ARIMA — AutoRegressive Integrated Moving Average — is a classical statistical model for forecasting time series. It combines three components: the autoregressive (AR) term uses past values of the series to predict future values; the integrated (I) term differences the series to achieve stationarity; and the moving average (MA) term uses past forecast errors to refine predictions. The three parameters are written as ARIMA(p, d, q), where p controls the AR order, d the degree of differencing, and q the MA order.
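
To show how the pieces fit, here is a hand-rolled sketch of the simplest non-trivial case, ARIMA(1, 1, 0): difference once (d = 1), fit an AR(1) to the differences by least squares (p = 1), and use no MA term (q = 0). The function names and simulated data are assumptions for illustration; a real analysis would normally rely on a statistics library that also handles MA terms and proper likelihood estimation.

```python
import numpy as np

def fit_arima_110(y):
    """Least-squares fit of diff(y)_t = c + phi * diff(y)_{t-1} + e_t."""
    d = np.diff(np.asarray(y, dtype=float))          # the "I" step, d = 1
    X = np.column_stack([np.ones(len(d) - 1), d[:-1]])
    (c, phi), *_ = np.linalg.lstsq(X, d[1:], rcond=None)
    return c, phi

def forecast_arima_110(y, c, phi, steps):
    """Iterate the AR(1) on differences, then cumulate back to levels."""
    last_y, last_d = y[-1], y[-1] - y[-2]
    out = []
    for _ in range(steps):
        last_d = c + phi * last_d
        last_y = last_y + last_d
        out.append(last_y)
    return np.array(out)

# Simulated series whose first differences follow an AR(1) with phi = 0.6
rng = np.random.default_rng(2)
n, phi_true = 2000, 0.6
d = np.zeros(n)
for t in range(1, n):
    d[t] = phi_true * d[t - 1] + rng.normal(0.0, 1.0)
y = np.cumsum(d)

c_hat, phi_hat = fit_arima_110(y)
fc = forecast_arima_110(y, c_hat, phi_hat, steps=5)
print(f"phi_hat = {phi_hat:.3f}", fc)
```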

In quantitative finance, ARIMA and its extensions are used to generate short-term price forecasts, model volatility dynamics, and identify deviations from expected behavior. The confidence intervals produced by an ARIMA model widen as the forecast horizon extends, a natural reflection of uncertainty growing over time. Some practitioners use these intervals to go beyond point estimates, setting probabilistic bounds around expected outcomes and structuring risk accordingly. ARIMA models have some extensions as well. SARIMA accommodates seasonal patterns, and GARCH models, which can be viewed as time series models for the conditional variance, directly address the volatility clustering commonly observed in equity returns.
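
The widening is easiest to see in the simplest integrated model, a driftless random walk, i.e. ARIMA(0, 1, 0). There the h-step forecast is just the last value and the forecast error variance is h times the one-step variance, so the interval half-width grows with the square root of h. The sketch below assumes Gaussian shocks with a known standard deviation; `sigma` and `last_value` are made-up numbers.

```python
import numpy as np

sigma = 0.01          # assumed one-step shock standard deviation (log-price units)
last_value = 4.60     # assumed last observed log price
horizons = np.arange(1, 6)

# 95% interval for a driftless random walk: last_value +/- 1.96 * sigma * sqrt(h)
half_width = 1.96 * sigma * np.sqrt(horizons)
lower = last_value - half_width
upper = last_value + half_width

for h, lo, hi in zip(horizons, lower, upper):
    print(f"h={h}: [{lo:.4f}, {hi:.4f}]")
```

Quadrupling the horizon doubles the interval width. Models with AR or MA terms have different variance formulas, but for an integrated series the same growth with horizon applies.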

How to Use These Together

Time series analysis is most powerful when its tools are layered together. Begin by testing for stationarity before building any model — working on the raw price level rather than returns is one of the most common and costly mistakes in quantitative finance. Use the ACF plot to identify whether autocorrelation exists at meaningful lags, and decide whether a momentum-based or mean-reversion framework is more appropriate for the security in question. Apply moving averages as a real-time trend filter to smooth out daily noise before acting on signals. Deploy ARIMA or its extensions when structured short-term forecasts with explicit confidence bounds are needed. Or don't do any of this, and just have a better understanding of time series modelling.

Taken together, these frameworks provide a more complete picture of how a security behaves through time and help you answer your own questions, even if the question is "How should I expect the company to perform this year?".
