in Toronto, Canada
Gaston Slupski
1. Introduction
1.1. Background
Let’s assume that a fast food chain called “Wenzy’s” (similar to Wendy’s, but with different
owners) is coming to Toronto, Canada. The management is wondering where they should open
their first restaurant. They need a location that will generate profits quickly without
requiring heavy spending on advertising and customer retention.
For this, we need to understand these 3 different requirements, where the location:
1.3. Justification
The aim of this project is to help the fast food chain’s business succeed in the city. If the
first location is highly profitable, the business will be able to open new restaurants
around the city and start building its reputation in Toronto, Canada.
2. Data collection and data requirements
To develop the project, we need to collect different kinds of data from various
sources. Among this data, we need:
First, we parsed the Wikipedia article to retrieve the list of neighborhoods in
Toronto, Canada, using the ‘BeautifulSoup’ module. Once the article was parsed, the data
was arranged in a table with 3 columns: 'PostalCode', 'Borough', and 'Neighborhood'.
Next, the Cognitive Class list of coordinates was downloaded (in CSV format), and its
‘Latitude’ and ‘Longitude’ columns were added to the neighborhoods list. Then, a list of
100 venues in Toronto, Canada, was downloaded from the Foursquare API and merged into the
neighborhoods list, matching every venue to its neighborhood. The result was one large
table with 8 columns (including the counter).
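The merging steps above can be sketched with pandas. This is a minimal illustration under stated assumptions, not the project's actual code: the sample rows, the coordinate values, and the 'Venue' and 'VenueCategory' column names are made up for the example, and only 'PostalCode', 'Borough', 'Neighborhood', 'Latitude' and 'Longitude' come from the text.

```python
import pandas as pd

# Made-up rows standing in for the table parsed from the Wikipedia article.
neighborhoods = pd.DataFrame({
    "PostalCode": ["M5A", "M5B"],
    "Borough": ["Downtown Toronto", "Downtown Toronto"],
    "Neighborhood": ["Regent Park", "Garden District"],
})

# Made-up rows standing in for the coordinates CSV (values are illustrative only).
coordinates = pd.DataFrame({
    "PostalCode": ["M5A", "M5B"],
    "Latitude": [43.65, 43.66],
    "Longitude": [-79.36, -79.38],
})

# Made-up rows standing in for the venue list; column names are assumptions.
venues = pd.DataFrame({
    "Neighborhood": ["Regent Park", "Regent Park", "Garden District"],
    "Venue": ["Venue A", "Venue B", "Venue C"],
    "VenueCategory": ["Coffee Shop", "Park", "Pizza Place"],
})

# Step 1: attach coordinates to each neighborhood by postal code.
located = neighborhoods.merge(coordinates, on="PostalCode", how="left")

# Step 2: attach every venue to its neighborhood (one row per venue).
dataset = located.merge(venues, on="Neighborhood", how="inner")

print(dataset)
```

A left merge keeps every neighborhood even if a coordinate is missing, while the inner merge keeps only neighborhoods that have at least one venue; either choice could be changed depending on how missing data should be handled.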
3. Exploratory Data Analysis
With the dataset finalized, we analyzed certain elements of the data.
First, with 2251 rows and 7 columns in the dataset, we counted the number of venues
per neighborhood. We then counted the unique venue categories in the dataset, finding
279 unique categories.
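As a rough sketch of these counts, assuming the dataset is held in a pandas DataFrame with hypothetical 'Neighborhood' and 'VenueCategory' columns (the sample rows below are invented for illustration):

```python
import pandas as pd

# Invented sample; the real dataset has 2251 rows.
venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "B", "B"],
    "VenueCategory": ["Coffee Shop", "Coffee Shop", "Park", "Pizza Place", "Park"],
})

# Size of the dataset as (rows, columns).
print(venues.shape)

# Number of venues in each neighborhood.
venues_per_neighborhood = venues.groupby("Neighborhood")["VenueCategory"].count()
print(venues_per_neighborhood)

# Number of distinct venue categories across the whole dataset.
unique_categories = venues["VenueCategory"].nunique()
print(unique_categories)
```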
With these preliminary numbers, all the categories were grouped by neighborhood, and the
mean frequency of occurrence of each category was taken. This means that, for each
neighborhood, we had every venue category with its average frequency.
From this table, we looked at the top 5 venue categories by frequency for each
neighborhood. This showed us that we could create a new dataset listing the top 10
venue categories per neighborhood. This final dataset serves as the input for modeling
with a clustering algorithm.
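One common way to obtain such per-neighborhood frequencies is to one-hot encode the categories and average them within each neighborhood. The sketch below assumes that approach; the column names, the sample rows, and the helper `top_categories` are hypothetical and not taken from the project.

```python
import pandas as pd

# Invented sample: one row per venue.
venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "B", "B", "B"],
    "VenueCategory": ["Coffee Shop", "Coffee Shop", "Park",
                      "Pizza Place", "Pizza Place", "Park"],
})

# One column per category, 1.0 where the venue belongs to that category.
one_hot = pd.get_dummies(venues["VenueCategory"], dtype=float)
one_hot.insert(0, "Neighborhood", venues["Neighborhood"])

# Mean of each indicator column = share of the neighborhood's venues in that category.
frequencies = one_hot.groupby("Neighborhood").mean()


def top_categories(row, n):
    """Return the n category names with the highest frequency in one row."""
    return row.sort_values(ascending=False).head(n).index.tolist()


# Top 2 categories per neighborhood here; the report uses the top 5 and top 10.
top = {name: top_categories(row, 2) for name, row in frequencies.iterrows()}
print(frequencies)
print(top)
```

In this sample, neighborhood A has two coffee shops out of three venues, so its 'Coffee Shop' frequency is 2/3 and that category ranks first.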
4. Modelling: clusters
4.1. Clustering
For the modelling section, a clustering algorithm was used to group neighborhoods in
Toronto, Canada, that are similar across different parameters. The model was defined
with 10 clusters, labeled from 0 to 9.
A new dataset was then created from the final version of the original dataset (see Table 1),
with the top 10 most common venue categories and the cluster label added for each
neighborhood.
Once the clusters were set on the new dataset, a new map of the neighborhoods in
Toronto, Canada, was created, with each cluster marked in a different color.
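The text does not name the clustering algorithm, so the sketch below assumes k-means from scikit-learn purely as an illustration. The frequency values and the 'ClusterLabel' column name are invented, and the sample uses 2 clusters instead of 10 because it only has four neighborhoods.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Invented category frequencies per neighborhood (each row sums to 1).
frequencies = pd.DataFrame(
    {
        "Coffee Shop": [0.6, 0.5, 0.0, 0.1],
        "Park": [0.1, 0.1, 0.7, 0.8],
        "Pizza Place": [0.3, 0.4, 0.3, 0.1],
    },
    index=["A", "B", "C", "D"],
)

# Assumption: k-means; the report used 10 clusters on the full dataset.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(frequencies)

# Attach the cluster label to each neighborhood, as described in the text.
labeled = frequencies.copy()
labeled.insert(0, "ClusterLabel", kmeans.labels_)
print(labeled)
```

Fixing `random_state` makes the labels reproducible between runs; the label numbers themselves are arbitrary, so only the grouping they express should be interpreted.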
With this new map, each cluster was analyzed by its venue categories.
We could see that each cluster has a distinct mix of venues related to a specific
environment. For example, in cluster 4 the most common venue was banks,
followed by women’s stores and parks. We could therefore assume that the neighborhoods
in cluster 4 are residential areas. In addition, cluster 6 had parks as its most common
venue, followed by other green areas (e.g. trails) and few restaurants. These are also,
we can guess, residential areas.
On the other hand, cluster 3 had pizza places as its most common venue, followed by
grocery stores and pharmacies. We could assume here that these neighborhoods are oriented
toward younger residents, while also being residential. Moreover, cluster 1 had coffee
shops as its most common venue, followed by restaurants. Here, we could assume that these
neighborhoods are more commercially oriented.
5. Conclusions
Having analyzed all the clusters, we can conclude that the first “Wenzy’s” restaurant in
Toronto should be located in any of the neighborhoods within cluster 1. These neighborhoods
offer a variety of shops and offices and have a high population density. In addition, there
are multiple restaurants, which represents an opportunity to distinguish the “Wenzy’s” brand
among the regular diners in the area. Moreover, there isn’t a high percentage of fast food
restaurants in these neighborhoods, meaning low competition in the area.
Now the management of “Wenzy’s” needs to start their own research to find a profitable
building in which to open their first fast food restaurant in Toronto. Good luck to them!