@elonmusk: So true Yun-Ta Tsai (@yunta_tsai) Many people think any given ML project is 99% training. In rea...

So true

Yun-Ta Tsai (@yunta_tsai)

Many people think any given ML project is 99% training.

In reality, it’s 50% evaluation, 40% data cleaning, 8% integration, and 2% training.

The first two set the noise floor for learning. No ML magic matters; the model cannot lower the noise floor, as that’s the optimal bound of Shannon encoding of your data.

Thus, not a single day goes by without me thinking about ontology. Even the old labels have to be constantly reviewed.

— https://nitter.net/yunta_tsai/status/2068364559698780520#m