When the base data is a collection of 40 million postings in a table that grows longer every day, spread across 472,000 accounts on roughly 5,000 properties; when the use case is filtering, grouping, summing, and sorting all of it; and when everything has to happen on demand for users trying to keep track of their property's real-time financial situation - that's when you realise that using the ORM for mere data access and crunching the data in Python might not be the ticket. And when users suddenly want to compare their property to some of the 5,000 others, a Python-first approach exposes them to a considerable wait which, frankly, can be a bit embarrassing. You don't want to be that guy.
I'm going to talk about how I pushed all this heavy lifting to Postgres using the Django ORM and (almost) no raw SQL - with techniques like chained annotations, window functions for cumulative sums, subqueries with OuterRef, conditional expressions in the query, and, for extra fun, digging into JSONB arrays, because hey - there's a JSON blob describing how all of this data finally has to be sorted. I'll also touch on materialized views. The benefits of this approach are not only dramatically better performance (what my friend said...) but also super-readable, solid code.
If you're a Django developer working with large and growing datasets, or if you're curious how your app can benefit from the performance of Postgres without writing raw SQL, this talk is for you.