My article "Opening the Stage Door for Big Data in Broadway — Building Databases from unstructured text using Machine Learning."
I've spent a good portion of my summer devising a complex machine learning algorithm which extracts data for all Broadway shows, ever. This article summarizes my approach, and provides explanations, pitfalls, and insights. Also, the code for this algorithm is available through the article!
Scraped data; no statistical analysis conducted.
So far this summer, I've been researching the mathematics and data science of dynamic modelling. My goal is to develop a model for maximizing revenue and attendance for Broadway shows. The proposal below outlines the methods by which I hope to accomplish this.
– Give it a read?
– Let me know what you think?
– Feedback or ideas are warmly welcome!
The following interactive graph describes the revenue performance of last week's Broadway shows. Try clicking on the graph and see what comes of it. This is a learning experience for me so I appreciate all feedback you can provide! Send me an email if something stands out: firstname.lastname@example.org
Weekly Gross versus Capacity (Week of July 1, 2018)
The x axis describes the percentage of seats occupied for a production. A value of "80" means that the production filled 80% of its seats. A value above 100% is achieved by selling standing room.
The y axis describes the weekly gross of a production for that week's performance. This includes all sales for that week's performances, regardless of when tickets were purchased. For example, if someone purchased tickets 2 months ago for a show within this week, that sale would be included in this week's data set.
The size of each point describes the total attendance for that production for the week. Of note, the largest (grey) point is Harry Potter and the Cursed Child.
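For readers who want to build a similar chart, here is a minimal sketch of how the three encodings described above (x = capacity percentage, y = weekly gross, point size = attendance) could be computed. It is an illustration only, not the code behind the graph: the `WeeklyRecord` class, the `bubble_points` helper, the `max_area` parameter, and the two sample shows are all hypothetical, and capacity is assumed to be attendance divided by the seats available that week.

```python
from dataclasses import dataclass


@dataclass
class WeeklyRecord:
    """One production's figures for a single week (hypothetical structure)."""
    show: str
    gross: float           # weekly gross in dollars
    attendance: int        # total attendance for the week
    seats_available: int   # total seats across the week's performances

    @property
    def capacity_pct(self) -> float:
        # Assumption: capacity is attendance over available seats.
        # Standing-room sales can push this above 100.
        return 100.0 * self.attendance / self.seats_available


def bubble_points(records, max_area=400.0):
    """Return (x, y, area) triples; area is scaled so the
    best-attended show gets max_area."""
    largest = max(r.attendance for r in records)
    return [
        (r.capacity_pct, r.gross, max_area * r.attendance / largest)
        for r in records
    ]


# Made-up sample data, purely for illustration.
records = [
    WeeklyRecord("Show A", 1_000_000.0, 8_000, 10_000),
    WeeklyRecord("Show B", 2_000_000.0, 10_200, 10_000),
]
points = bubble_points(records)
for record, (x, y, area) in zip(records, points):
    print(f"{record.show}: capacity={x:.1f}%, gross=${y:,.0f}, area={area:.1f}")
```

The resulting triples can be passed to a plotting library; with matplotlib, for instance, the x values, y values, and areas map onto the `x`, `y`, and `s` arguments of `scatter`.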
Some extra info: