Shortest Path Analysis in Python

Python Project: Learn Python Shortest Path Analysis in Detail with Examples
Written by Moh Freelancer |10-May-2019 | 0 Comments | 126 Views

In this section, we are going to perform an analysis that will be using all the things that we have learned previously in Shortest Path Analysis. It will be using networkX skill, pandas skill and some newer form of visualization which we have not done yet.

So we are doing today the analysis of the shortest path of flight keeping in mind that no direct flight go from the source to direction and also keeping in view that No New York City airport has the capacity of taking a direct flight to the destination.

 

In this task, we first acquire the longitude and latitude of the countries and then we will create an extensive data set by calculating the distance from each of the countries.

So let's start the project:

import random
import csv
import geopy.distance
from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="Airports Data")
import pickle
import networkx as nx
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import folium

SourceCode Credit: Hassan Rehman, Shortest Path Analysis, (2019), GitHub repository, https://github.com/HassanRehman11/Shortest-Path-Analysis/blob/master/Map.ipynb

 

So, first of all, we have imported random to do some random jobs when we need to randomize something we will be using this module. We are going to create a CSV file that will store all the location or source and all the location of the destination with the distance from the source to destination.

 

Then there is a geolocator which will extract the information out of the name of the place it will provide the longitude and latitude of the place that is very important for this task.

Then there is a pickle to pickle the longitude and latitude, so we don't have to extract every time. Then there is networkX which we will be using to make the relation based on distance so that we can make the shortest path out of it. If anyone wants to plot networkX graph than it can be done using Matplotlib library and in last we have a folium. Folium is a new thing for you. It is a mapping technique in which we can plot markers and make lines on the world map.

 

Ok so after making all the significant imports we now the move to our first part building a data set that we can use to analyze:

loc = []
capital = ['Washington','Las Vegas','California','Beijing','Tehran',
           'Kabul','Baku','Dhaka','Cairo','Baghdad',
           'Tel Aviv','Astana','Kuwait City','Tripoli','Muscat','Riyadh','Abu Dhabi','Amman',
           'Moscow','London','New York City','Istanbul','Bangkok','Geneva','Colombo']
t = len(capital)
#for city in range(t):
    #location = geolocator.geocode(capital[city])
    #loc.append((location.latitude, location.longitude))
 
From = 'New York City'
To = 'London'
forbidden_from = ['New York City']
forbidden_to = ['London']
with open('lon_lat.pkl', 'rb') as f:
    city = pickle.load(f)
    capital=city[0]
    loc = city[1]
with open('large_data.csv', mode='w') as capital_distance:
    capital_writer = csv.writer(capital_distance, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    capital_writer.writerow(["Source", "Target", "Distance","s_lon","s_lat","t_lon","t_lat"])
    for i in range(t):
        for j in range(t):
            if (i!=j):
                source = loc[i]
                target = loc[j]
                if (not((capital[i] in forbidden_from and capital[j] in forbidden_to) or (capital[i] in forbidden_to and capital[j] in forbidden_from))):
                    distance = geopy.distance.vincenty(source,target).miles
                    distance = '%.2f' %distance
                    capital_writer.writerow([capital[i],capital[j],distance,source[0],source[1],target[0],target[1]])

SourceCode Credit: Hassan Rehman, Shortest Path Analysis, (2019), GitHub repository, https://github.com/HassanRehman11/Shortest-Path-Analysis/blob/master/Map.ipynb

Ok so in the first line we initialize an empty list in which we are going to append longitude and latitude. We are not taking all of the countries we are taking some of the countries. In this case, we have taken 26 countries.

After this initialization, we start a loop that will iterate over all of the countries and extract the longitude and latitude of the countries.

We already said theta there should be no direct flight from New York to the destination so we forbidden New York City airport and after that, we run a loop to get the distance from each of the countries. This step doesn't include the distance from the New York City airport and the destination.

Distance is calculated using the geopy module and the unit is miles. We can also convert into kilometers by specifying km instead of miles. The CSV file looks like the following:

 

 527

New York City

Mumbai

7808.03

40.730862

-73.987156

18.938771

72.835335

528

New York City

Beijing

6842.98

40.730862

-73.987156

39.906217

116.391276

529

New York City

Tehran

6135.80

40.730862

-73.987156

35.700618

51.401378

530

New York City

Kabul

6748.57

40.730862

-73.987156

34.526013

69.177648

531

New York City

Baku

5830.62

40.730862

-73.987156

40.375443

49.832675

532

New York City

Dhaka

7881.95

40.730862

-73.987156

23.759357

90.378814

533

New York City

Cairo

5616.46

40.730862

-73.987156

30.048819

31.243666

534

New York City

Baghdad

6004.45

40.730862

-73.987156

33.302431

44.378799

535

New York City

Tel Aviv

5674.81

40.730862

-73.987156

32.080481

34.780527

536

New York City

Astana

5771.76

40.730862

-73.987156

51.128258

71.430550

537

New York City

Kuwait City

6347.83

40.730862

-73.987156

29.379709

47.973563

538

New York City

Tripoli

4660.58

40.730862

-73.987156

32.896672

13.177792

539

New York City

Muscat

7066.25

40.730862

-73.987156

23.512550

58.551402

540

New York City

Riyadh

6546.47

40.730862

-73.987156

24.631969

46.715065

541

New York City

Abu Dhabi

6856.08

40.730862

-73.987156

23.997644

53.643910

542

New York City

Amman

5729.39

40.730862

-73.987156

31.951569

35.923963

543

New York City

Moscow

4678.28

40.730862

-73.987156

55.750718

37.617661

544

New York City

Istanbul

5024.81

40.730862

-73.987156

41.009633

28.965165

545

New York City

Bangkok

8667.21

40.730862

-73.987156

13.753893

100.816080

546

New York City

Geneva

3871.64

40.730862

-73.987156

46.201756

6.146601

547

New York City

Colombo

8757.31

40.730862

-73.987156

6.921812

79.865561

 

It contains the source and destination name along with their coordinates.

with open('lon_lat.pkl', 'rb') as f:
    city = pickle.load(f)
    capital=city[0]
    loc = city[1]
G = nx.Graph()
df = pd.read_csv('large_data.csv')
From = 'New York City'
To= 'London'
city = df.Source.unique()
G.add_nodes_from(city)
data = df[['Source', 'Target', 'Distance']]
edge = list(map(tuple,data.values))
for source,target,distance in edge:
    G.add_edge(source, target,edge_color='g', weight=distance)
path = nx.dijkstra_path(G, From,To)
print(path)

SourceCode Credit: Hassan Rehman, Shortest Path Analysis, (2019), GitHub repository, https://github.com/HassanRehman11/Shortest-Path-Analysis/blob/master/Map.ipynb

We open the file that we have pickled to get the city and its longitude and latitude for creating the markers later. After that, we have initialized the networkX graph and make the edges and nodes of all cities and their respective distance from each other. After that, we call the method of shortest path and print the shortest path. The output is as follows:

['New York City', 'Geneva', 'London']

 

So according to it the shortest path is from New York City to Geneva and then to London.

Now we have the shortest path lets plot on folium.

a = list(nx.all_shortest_paths(G,From,To))
short={}
for e in a:
    total_d=0.0
    for i in range(len(e)-1):
        j = df[(df.Source == e[i])&(df.Target == e[i+1])].index
        dist = float((df.iloc[j,]['Distance']))
        total_d = total_d+dist
    short[tuple(e)]= total_d

SourceCode Credit: Hassan Rehman, Shortest Path Analysis, (2019), GitHub repository, https://github.com/HassanRehman11/Shortest-Path-Analysis/blob/master/Map.ipynb

However, before plotting the map, we will get all the shortest path from the networkX all_shortet_path module to see other alternatives too. After getting all the shortest path, we iterate over the list and calculate the total distance from the source to the 2nd city and then the destination and store it in a dictionary.

To plot the shortest path on folium and afterward, we will output all the shortest paths.

So, the code for this is as follows:

import folium
lon = []
lat = []
for i in range(len(loc)):
    lon.append(loc[i][0])
    lat.append(loc[i][1])
data = pd.DataFrame({
'lat':lat,
'lon':lon,
'name':capital
})
m = folium.Map(location=[20, 0], tiles="Mapbox Bright", zoom_start=2)
for i in range(0,len(data)):
    folium.Marker([data.iloc[i]['lon'], data.iloc[i]['lat']], popup=data.iloc[i]['name']).add_to(m)
   
color = ['red','green','blue','yellow','black','brown']
 
 
 
for i in range(len(path)-1):
    j = df[(df.Source == path[i])&(df.Target == path[i+1])].index
    lon1 = float((df.iloc[j,]['s_lon']))
    lat1 = float((df.iloc[j,]['s_lat']))
   
    lon2 = float((df.iloc[j,]['t_lon']))
    lat2 = float((df.iloc[j,]['t_lat']))
   
    dist = float((df.iloc[j,]['Distance']))
    folium.PolyLine(locations=[[lon1,lat1], [lon2,lat2]],popup=str(dist)+"miles",color=random.choice(color)).add_to(m)
 
m.save('map.html')
m

SourceCode Credit: Hassan Rehman, Shortest Path Analysis, (2019), GitHub repository, https://github.com/HassanRehman11/Shortest-Path-Analysis/blob/master/Map.ipynb

After running this script following map will be saved as an HTML:

plot the shortest path on folium in python

Figure 1

 

It shows all the cities specified by the marker and the polyline from Islamabad to Kabul and then London.

Now let’s see how we can get the all shortest paths:

print("All possible routes are: ")
sorted_by_value = sorted(short.items(), key=lambda kv: kv[1])
for e in sorted_by_value:
    print(e)

SourceCode Credit: Hassan Rehman, Shortest Path Analysis, (2019), GitHub repository, https://github.com/HassanRehman11/Shortest-Path-Analysis/blob/master/Map.ipynb

To print all the shortest paths, we first sort the dictionary with respect the distance in descending order. The output is as follows:

All possible routes are:

(('Islamabad', 'Kabul', 'London'), 3778.01)

(('Islamabad', 'Baku', 'London'), 3827.1400000000003)

(('Islamabad', 'Moscow', 'London'), 3830.11)

(('Islamabad', 'Tehran', 'London'), 3969.19)

(('Islamabad', 'Istanbul', 'London'), 4003.79)

(('Islamabad', 'Geneva', 'London'), 4008.06)

(('Islamabad', 'Astana', 'London'), 4180.9400000000005)

(('Islamabad', 'Baghdad', 'London'), 4190.610000000001)

(('Islamabad', 'Kuwait City', 'London'), 4385.73)

(('Islamabad', 'Amman', 'London'), 4409.83)

(('Islamabad', 'Tel Aviv', 'London'), 4421.18)

(('Islamabad', 'Cairo', 'London'), 4630.96)

(('Islamabad', 'Abu Dhabi', 'London'), 4739.13)

(('Islamabad', 'Muscat', 'London'), 4746.31)

(('Islamabad', 'Riyadh', 'London'), 4773.26)

(('Islamabad', 'Tripoli', 'London'), 4857.93)

(('Islamabad', 'Mumbai', 'London'), 5490.7300000000005)

(('Islamabad', 'Dhaka', 'London'), 6234.25)

(('Islamabad', 'Colombo', 'London'), 7307.17)

(('Islamabad', 'Beijing', 'London'), 7497.14)

(('Islamabad', 'Bangkok', 'London'), 8164.51)

(('Islamabad', 'New York City', 'London'), 10366.869999999999)

 

 

Mentioned above is the analysis of the shortest path. This code contains many areas to improve it which is intentionally designed in such a way so that you improve it by yourself. The task is to make it more dynamic less with less loop and how we can improve the efficiency and timing of this code.



Login/Sign Up

Comments




Related Posts



© Copyright 2019, All Rights Reserved. paayi.com

This site uses cookies. By continuing to use this site or clicking "I Agree", you agree to the use of cookies. Read our cookies policy and privacy statement for more information.