## Python Tutorials: Shortest Path Analysis in Python

##### Python Tutorials: Learn Python Shortest Path Analysis in Detail with Examples
Written by |17-Oct-2020 | 0 Comments | 762 Views

In this section, we are going to perform an analysis that will be using all the things that we have learned previously in Shortest Path Analysis. It will be using networkX skill, pandas skill, and some newer form of visualization, which we have not done yet.

So we are doing today the analysis of the shortest path of flight, keeping in mind that no direct flight goes from the source to direction and also keeping in view that No New York City airport has the capacity of taking a direct flight to the destination.

In this task, we first acquire the longitude and latitude of the countries, and then we will create an extensive data set by calculating the distance from each of the countries.

So let's start the project:

 import random import CSV import geopy.distance from geopy.geocoders import Nominatim geolocator = Nominatim(user_agent="Airports Data") import pickle import networkx as nx import pandas as pd import matplotlib.pyplot as plt import NumPy as np import folium SourceCode Credit: Hassan Rehman, Shortest Path Analysis, (2019), GitHub repository, https://github.com/HassanRehman11/Shortest-Path-Analysis/blob/master/Map.ipynb

So, first of all, we have imported a random module to do some random jobs; when we need to randomize something, we will be using this module. We are going to create a CSV file that will store all the locations/sources and all the locations of the destination with the distance from the source to the destination.

Then there is a geolocator that will extract the information out of the name of the place; it will provide the longitude and latitude of the place that is very important for this task.

Then there is a pickle to pickle the longitude and latitude, so we don't have to extract every time. Then there is networkX, which we will be using to make the relation based on distance so that we can make the shortest path out of it. If anyone wants to plot a networkX graph, then it can be done using the Matplotlib library, and in last, we have a folium. Folium is a new thing for you. It is a mapping technique in which we can plot markers and make lines on the world map.

Ok so after making all the significant imports we now the move to our first part building a data set that we can use to analyze:

 loc = [] capital = ['Washington','Las Vegas','California','Beijing','Tehran',            'Kabul','Baku','Dhaka','Cairo','Baghdad',            'Tel Aviv','Astana','Kuwait City','Tripoli','Muscat','Riyadh','Abu Dhabi','Amman',            'Moscow','London','New York City','Istanbul','Bangkok','Geneva','Colombo'] t = len(capital) #for city in range(t):     #location = geolocator.geocode(capital[city])     #loc.append((location.latitude, location.longitude))   From = 'New York City' To = 'London' forbidden_from = ['New York City'] forbidden_to = ['London'] with open('lon_lat.pkl', 'rb') as f:     city = pickle.load(f)     capital=city[0]     loc = city[1] with open('large_data.csv', mode='w') as capital_distance:     capital_writer = csv.writer(capital_distance, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)     capital_writer.writerow(["Source", "Target", "Distance","s_lon","s_lat","t_lon","t_lat"])     for i in range(t):         for j in range(t):             if (i!=j):                 source = loc[i]                 target = loc[j]                 if (not((capital[i] in forbidden_from and capital[j] in forbidden_to) or (capital[i] in forbidden_to and capital[j] in forbidden_from))):                     distance = geopy.distance.vincenty(source,target).miles                     distance = '%.2f' %distance                     capital_writer.writerow([capital[i],capital[j],distance,source[0],source[1],target[0],target[1]]) SourceCode Credit: Hassan Rehman, Shortest Path Analysis, (2019), GitHub repository, https://github.com/HassanRehman11/Shortest-Path-Analysis/blob/master/Map.ipynb

Ok, so in the first line, we initialize an empty list in which we are going to append longitude and latitude. We are not taking all of the countries; we are taking some of the countries. In this case, we have taken 26 countries.

After this initialization, we start a loop that will iterate over all of the countries and extract the longitude and latitude of the countries.

We already said theta there should be no direct flight from New York to the destination, so we have forbidden New York City airport, and after that, we run a loop to get the distance from each of the countries. This step doesn't include the distance from the New York City airport and the destination.

Distance is calculated using the geopy module, and the unit is miles. We can also convert into kilometers by specifying km instead of miles. The CSV file looks like the following:

 527 New York City Mumbai 7808.03 40.7309 -73.9872 18.9388 72.8353 528 New York City Beijing 6842.98 40.7309 -73.9872 39.9062 116.391 529 New York City Tehran 6135.8 40.7309 -73.9872 35.7006 51.4014 530 New York City Kabul 6748.57 40.7309 -73.9872 34.526 69.1776 531 New York City Baku 5830.62 40.7309 -73.9872 40.3754 49.8327 532 New York City Dhaka 7881.95 40.7309 -73.9872 23.7594 90.3788 533 New York City Cairo 5616.46 40.7309 -73.9872 30.0488 31.2437 534 New York City Baghdad 6004.45 40.7309 -73.9872 33.3024 44.3788 535 New York City Tel Aviv 5674.81 40.7309 -73.9872 32.0805 34.7805 536 New York City Astana 5771.76 40.7309 -73.9872 51.1283 71.4305 537 New York City Kuwait City 6347.83 40.7309 -73.9872 29.3797 47.9736 538 New York City Tripoli 4660.58 40.7309 -73.9872 32.8967 13.1778 539 New York City Muscat 7066.25 40.7309 -73.9872 23.5126 58.5514 540 New York City Riyadh 6546.47 40.7309 -73.9872 24.632 46.7151 541 New York City Abu Dhabi 6856.08 40.7309 -73.9872 23.9976 53.6439 542 New York City Amman 5729.39 40.7309 -73.9872 31.9516 35.924 543 New York City Moscow 4678.28 40.7309 -73.9872 55.7507 37.6177 544 New York City Istanbul 5024.81 40.7309 -73.9872 41.0096 28.9652 545 New York City Bangkok 8667.21 40.7309 -73.9872 13.7539 100.816 546 New York City Geneva 3871.64 40.7309 -73.9872 46.2018 6.1466 547 New York City Colombo 8757.31 40.7309 -73.9872 6.92181 79.8656

It contains the source and destination names along with their coordinates.

 with open('lon_lat.pkl', 'rb') as f:     city = pickle.load(f)     capital=city[0]     loc = city[1] G = nx.Graph() df = pd.read_csv('large_data.csv') From = 'New York City' To= 'London' city = df.Source.unique() G.add_nodes_from(city) data = df[['Source', 'Target', 'Distance']] edge = list(map(tuple,data.values)) for source,target,distance in edge:     G.add_edge(source, target,edge_color='g', weight=distance) path = nx.dijkstra_path(G, From,To) print(path) SourceCode Credit: Hassan Rehman, Shortest Path Analysis, (2019), GitHub repository, https://github.com/HassanRehman11/Shortest-Path-Analysis/blob/master/Map.ipynb

We open the file that we have pickled to get the city and its longitude and latitude for creating the markers later. After that, we have initialized the networkX graph and make the edges and nodes of all cities and their respective distance from each other. After that, we call the method of shortest path and print the shortest path. The output is as follows:

 ['New York City', 'Geneva', 'London']

So, according to it, the shortest path is from New York City to Geneva and then to London.

Now we have the shortest path lets plot on folium.

 a = list(nx.all_shortest_paths(G,From,To)) short={} for e in a:     total_d=0.0     for i in range(len(e)-1):         j = df[(df.Source == e[i])&(df.Target == e[i+1])].index         dist = float((df.iloc[j,]['Distance']))         total_d = total_d+dist     short[tuple(e)]= total_d SourceCode Credit: Hassan Rehman, Shortest Path Analysis, (2019), GitHub repository, https://github.com/HassanRehman11/Shortest-Path-Analysis/blob/master/Map.ipynb

However, before plotting the map, we will get all the shortest path from the networkX all_shortet_path module to see other alternatives too. After getting all the shortest path, we iterate over the list and calculate the total distance from the source to the 2nd city and then the destination and store it in a dictionary.

To plot the shortest path on folium and afterward, we will output all the shortest paths.

So, the code for this is as follows:

 import folium lon = [] lat = [] for i in range(len(loc)):     lon.append(loc[i][0])     lat.append(loc[i][1]) data = pd.DataFrame({ 'lat':lat, 'lon':lon, 'name':capital }) m = folium.Map(location=[20, 0], tiles="Mapbox Bright", zoom_start=2) for i in range(0,len(data)):     folium.Marker([data.iloc[i]['lon'], data.iloc[i]['lat']], popup=data.iloc[i]['name']).add_to(m)     color = ['red','green','blue','yellow','black','brown']       for i in range(len(path)-1):     j = df[(df.Source == path[i])&(df.Target == path[i+1])].index     lon1 = float((df.iloc[j,]['s_lon']))     lat1 = float((df.iloc[j,]['s_lat']))         lon2 = float((df.iloc[j,]['t_lon']))     lat2 = float((df.iloc[j,]['t_lat']))         dist = float((df.iloc[j,]['Distance']))     folium.PolyLine(locations=[[lon1,lat1], [lon2,lat2]],popup=str(dist)+"miles",color=random.choice(color)).add_to(m)   m.save('map.html') m SourceCode Credit: Hassan Rehman, Shortest Path Analysis, (2019), GitHub repository, https://github.com/HassanRehman11/Shortest-Path-Analysis/blob/master/Map.ipynb

After running this script following map will be saved as an HTML:

 Figure 1

It shows all the cities specified by the marker and the polyline from Islamabad to Kabul and then London.

Now let’s see how we can get the all shortest paths:

 print("All possible routes are: ") sorted_by_value = sorted(short.items(), key=lambda kv: kv[1]) for e in sorted_by_value:     print(e) SourceCode Credit: Hassan Rehman, Shortest Path Analysis, (2019), GitHub repository, https://github.com/HassanRehman11/Shortest-Path-Analysis/blob/master/Map.ipynb

To print all the shortest paths, we first sort the dictionary with respect the distance in descending order. The output is as follows:

Mentioned above is the analysis of the shortest path. This code contains many areas to improve it, which is intentionally designed in such a way so that you improve it by yourself. The task is to make it more dynamic less with less loop and how we can improve the efficiency and timing of this code.