Amanda has correctly anticipated the direction I was headed in, i.e., to show that however small the out-of-sample (OOS) period is relative to the in-sample (IS) period, the Johansen procedure by itself is unable to produce a cointegrating vector that yields a stationary portfolio price process out of sample. Her iterative Kalman Filter approach, however, is able to cure the problem.
I don't want to gloss over this finding, because it is very important. In our toy problem we know the out-of-sample prices of the constituent ETFs, and can therefore test the stationarity of the portfolio process out of sample. In a real-world application, that discovery could only be made in real time, as the unknown future ETF prices are formed. In that scenario, all the researcher has to go on are the results of in-sample cointegration analysis, which demonstrate that the first cointegrating vector consistently yields a portfolio price process that is, with high probability, stationary in sample.
The researcher might understandably, but wrongly, be persuaded that the same is likely to hold true in future. Only when the assumed cointegration relationship falls apart in real time will the researcher discover otherwise, incurring significant losses in the process if the research has been translated into some kind of trading strategy.
A great many researchers have been down exactly this path, learning this important lesson the hard way. Nor do additional "safety checks", such as also requiring high levels of correlation between the constituent processes, add much value. They might offer the researcher comfort that a "belt and braces" approach is more likely to succeed, but in my experience that is not the case: the problem of non-stationarity in the out-of-sample price process persists.
For a more detailed discussion of the problem see this post: Why Statistical Arbitrage Breaks Down
I was hitherto unaware of any methodology for tackling this problem, which is why Amanda's discovery is so important. As she demonstrates in her latest post, the iterative Kalman Filter approach is capable of producing a stationary out-of-sample process, starting from the initial estimates of the cointegrating vector derived from the Johansen procedure.
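For readers unfamiliar with the building block involved, the sketch below shows a generic Kalman filter for time-varying regression weights, seeded with an initial estimate such as a normalized Johansen vector. I must stress that this is not Amanda's iterative procedure, whose details are in her post. It is a textbook random-walk-coefficient filter, and the function name `kalman_hedge_weights` and the noise settings `q`, `r` and `p0` are illustrative assumptions of mine.

```python
import numpy as np


def kalman_hedge_weights(y, X, beta0, q=1e-5, r=1.0, p0=1.0):
    """Track time-varying weights beta_t in y_t = x_t' beta_t + noise.

    y     : (T,) price of the asset whose weight is normalized to 1
    X     : (T, k) prices of the remaining assets
    beta0 : (k,) starting weights, e.g. from a normalized Johansen vector
    q, r  : assumed state and observation noise variances (hypothetical values)
    p0    : assumed initial variance of the weight estimates
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    n, k = X.shape

    beta = np.asarray(beta0, dtype=float).copy()
    P = p0 * np.eye(k)   # covariance of the weight estimates
    Q = q * np.eye(k)    # random-walk drift allowed in the weights

    betas = np.empty((n, k))
    errors = np.empty(n)
    for t in range(n):
        x = X[t]
        P = P + Q                    # predict step: weights follow a random walk
        e = y[t] - x @ beta          # one-step-ahead forecast error
        s = x @ P @ x + r            # variance of that forecast error
        gain = P @ x / s             # Kalman gain
        beta = beta + gain * e       # update the weights
        P = P - np.outer(gain, x) @ P
        betas[t] = beta
        errors[t] = e
    return betas, errors
```

If the Johansen vector is `w` with `w[0]` nonzero, one way to seed the filter is `beta0 = -w[1:] / w[0]`, with `y` the first asset and `X` the rest. The forecast errors then play the role of the portfolio price process whose stationarity can be tested.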
In fact, Amanda's discovery is important in two fields of econometric research: cointegration theory, and the theory of Kalman Filters as applied to modeling inter-asset relationships. As with the Johansen procedure, KF models have traditionally suffered from difficulties associated with non-stationarity in the out-of-sample period.
It's a tremendous achievement.
So, despite the fact that Amanda has leapt ahead to the finish line, I shall continue to plod along. Firstly, only by implementing the methodology can I be sure that I have properly and fully understood it. Secondly, as one discovers as one progresses in the field of quantitative research, fine details are often very important. So I am hoping that Amanda will provide additional guidance if I stray too far off-piste in the forthcoming exposition.