While everyone is joking about their Spotify Wrapped or similar recaps, the whole exercise hits on a serious point about data:
Creating a meaningful, insightful, and ultimately useful summary of a dataset is tougher than it seems on the surface.
This goes beyond summarization with LLMs/genAI. (That has proven pretty rocky, but non-LLM summaries are messy in their own way.)
Reducing a dataset to its essence means losing detail, which means being picky about what you want to show and why.
When summarizing a dataset, it helps to ask:
- What does the recipient want to know?
- Will I present something novel and/or actionable? Something they hadn't noticed before?
- Do they even want this information? (See: "Memories" apps that resurface painful life events)
And so on.
It's tougher to sort that out for mass-market summaries like Wrapped. (Still -- "do they want this?" should always be top of mind.)
But when you are creating a custom summary or analysis or similar data product, you certainly can and should ask.
(I originally posted this to Bluesky yesterday https://bsky.app/profile/qethanm.bsky.social/post/3lcitgpf7hc2z . Feel free to follow me there!)