Commercial Relations Of Colombia In The Last 20 Years#

The main idea of this notebook is to explore in a fast way the develop of the commercial relations of colombia during the las 20 years using python tools.

First, we are going to extract data from the TradeMap website on Colombian imports and exports. So we’re going to turn some kind of time series data into panel data to make it easier to explore.

# Data structures
import pandas as pd 

# Data graphic visualization
import seaborn as sns 
import matplotlib.pyplot as plt
from matplotlib import rcParams

# Extra Modules 
import warnings
warnings.filterwarnings("ignore")

Restructuring the data#

FROM THIS…#

imports = pd.read_table('https://raw.githubusercontent.com/BautistaDavid/Machine-Learning-Class-2022/main/data/Ttrade_map_imports_colombia.txt')
exports = pd.read_table('https://raw.githubusercontent.com/BautistaDavid/Machine-Learning-Class-2022/main/data/trade_map_exports_colombia.txt')

imports.drop(columns = 'Unnamed: 21', inplace = True)
exports.drop(columns = 'Unnamed: 21', inplace = True)

imports.columns = ['country']+ list(range(2001,2021))
exports.columns = ['country']+ list(range(2001,2021))

imports = imports.set_index('country')
exports = exports.set_index('country')

# imports = imports.drop(0).reset_index().drop(columns = 'index').set_index('country')
# exports = exports.drop(0).reset_index().drop(columns = 'index').set_index('country')

exports.head(2)

	2001	2002	2003	2004	2005	2006	2007	2008	2009	2010	2011	2012	2013	2014	2015	2016	2017	2018	2019	2020
country
World	12301486	11897488	13092218	16729678	21190439	24390975	29991332	37625882	32852986	39819529	56953516	60273618	58821870	54794812	35690767	31095444	37770123	41769699	39496225	31045676
United States of America	5344533	5328470	6160245	7042201	8851631	9948230	10609167	14288833	13123466	17143277	21948535	22216238	18692895	14470697	10052617	10215894	10975962	11080579	12265471	9442028

imports.head(2)

	2001	2002	2003	2004	2005	2006	2007	2008	2009	2010	2011	2012	2013	2014	2015	2016	2017	2018	2019	2020
country
World	12820352	12689965	13880613	17099537	21204162	26162440	32897045	39668840	32897671	40682508	54674822	58087854	59381197	64027610	54057599	44889367	46075705	51232805	50413025	43488662
United States of America	4442775	4051198	4112710	4806760	6033241	6960605	8631115	11583019	9505791	10531813	13663338	14140116	16428423	18256281	15583375	11946833	12096143	13075918	12832603	10634174

TO THIS…#

def restructuring_data(relation: str):
  panel_ = pd.DataFrame(columns = ['country','year','relation','amount'])
  countrys, years, amounts = [], [], []

  if relation == 'imports':
    for row in imports.iterrows():
      country, data = row[0], row[1]
      for _,i in enumerate(data):
        countrys.append(country)
        years.append(2001 + _)
        amounts.append(data[2001 + _])
  else:
    for row in exports.iterrows():
      country, data = row[0], row[1]
      for _,i in enumerate(data):
        countrys.append(country)
        years.append(2001 + _)
        amounts.append(data[2001 + _])

  panel_['country'] = countrys
  panel_['year'] = years
  panel_['amount'] = amounts
  panel_['relation'] = relation
  return panel_ 

panel = pd.concat([restructuring_data('imports'),restructuring_data('exports')])

panel.head(5)

	country	year	relation	amount
0	World	2001	imports	12820352
1	World	2002	imports	12689965
2	World	2003	imports	13880613
3	World	2004	imports	17099537
4	World	2005	imports	21204162

Let’s talk about imports#

Below you can see the annual behavior of Colombian imports from 2001 to 2020. There are also some interesting maximums points during this time series.

total = panel[(panel['country'] == 'World') & (panel['relation'] =='imports')]
total['year'] = pd.to_datetime(total['year'], format='%Y')

fig = plt.figure(figsize = (10,5))
plt.title('Below you can see the annual behavior of Colombian imports from 2001 to 2020.\
 There are\nalso some interesting maximums points during this time serie.', loc = 'left', y=0.98,fontsize=12)
plt.grid(axis = 'x')
fig.suptitle('Colombian Imports 2001 -2020', y=1.07, fontsize=16)
ax1 = fig.add_subplot(1,1,1)  # se crea un eje pero la mayoria de comandos funciona con normalidad de plt
ax1.fill_between(total['year'], total['amount'] /1000000,alpha=0.1,color="blue")
ax1.plot_date(total['year'],total['amount'] /1000000, color = 'blue',marker = 'x', linestyle = '-.', label = 'Colombian Imports')    
ax1.set_ylabel('USD Thousands of Millions', fontsize = 15)
ax1.set_xlabel('Year' ,fontsize = 15)
ax1.axis(ymin=0, ymax=80, xmin = pd.Timestamp('2000-10-01') )

plt.axvline(pd.Timestamp('2008-01-01'), color='black', ymax = 0.48, linestyle ='--', label = 'maximums of the serie')
plt.axvline(pd.Timestamp('2014-01-01'), color='black', ymax = 0.80, linestyle ='--')
plt.axvline(pd.Timestamp('2018-01-01'), color='black', ymax = 0.62, linestyle ='--')
plt.xticks(rotation = 30)
plt.legend()
fig.show()

So, there are some years when the amount of imports goes down compared to the following year. Now let’s explore the main partner countries of Colombian imports during these years. Maybe looking that information we can talk about why the imports amounts goes down in some years.

To facilitate this analysis we are going to choose the five main commercial partners of Colombia related to imports in 2020 and then we are going to analyze their behavior since 2001

five_imp = panel[(panel['relation'] == 'imports') & (panel['year'] == 2020)].sort_values(by = 'amount', ascending = False).reset_index().loc[1:5]
five_imp

	index	country	year	relation	amount
1	39	United States of America	2020	imports	10634174
2	59	China	2020	imports	10398882
3	79	Mexico	2020	imports	2925827
4	99	Brazil	2020	imports	2435489
5	119	Germany	2020	imports	1643891

data_five_imp = panel[((panel['country'] == 'United States of America') | (panel['country'] == 'China') | 
                     (panel['country'] == 'Mexico') | (panel['country'] == 'Brazil') | (panel['country'] == 'Germany'))]

data_five_imp = data_five_imp[data_five_imp['relation'] == 'imports']

data_five_imp['amount'] = data_five_imp['amount'] / 1000000

Below we can see an applied bar graph, where the 5 countries that Colombia imported the most during 2020 participate. The evolution of the participation of these 5 countries since 2001 is shown.

The two following graphs have the same idea, however, in one of them the bars for each year are presented consecutively, to facilitate the comparison of magnitudes. The last graph is simply the time series of the participation of these countries in Colombian imports from 2001 to 2002

(data_five_imp.groupby(['country','year'])['amount'].sum().reset_index()
.pivot(index ='year', values ='amount', columns = 'country')
.plot(kind = 'bar', stacked = True, figsize = (16,5),\
      title = 'Imported amount of the five main importing partners currently\n in Colombia from 2001 to 2020', ylabel ='USD Thousands of Millions'))

<AxesSubplot:title={'center':'Imported amount of the five main importing partners currently\n in Colombia from 2001 to 2020'}, xlabel='year', ylabel='USD Thousands of Millions'>

plt.figure(figsize=(20,6))
sns.set(style='whitegrid', rc={"grid.linewidth": 0.1})  # Seaborn Style
sns.set_context("notebook", font_scale=1.25)            # More Seaborn Style
rcParams['axes.titlepad'] = 20                          # set a space between  the title and the figure
plt.xticks(rotation=45)
plt.ylabel('USD Thousands of Millions')
plt.title('Imported amount of the five main importing partners currently\n in Colombia from 2001 to 2020')
sns.barplot(data = data_five_imp, x = 'year', y ='amount', hue = 'country')

<AxesSubplot:title={'center':'Imported amount of the five main importing partners currently\n in Colombia from 2001 to 2020'}, xlabel='year', ylabel='amount'>

data = data_five_imp
data['year'] = pd.to_datetime(data['year'], format='%Y')

plt.figure(figsize = (15,7))

for i in list(data['country'].unique()):
  data_ = data[data['country'] == i]
  plt.plot(data_['year'],data_['amount'], label = i, marker = '.')
plt.grid(axis = 'x')
plt.xlim(left =pd.Timestamp('2000-10-01'))
plt.ylim(0,20)
plt.legend()
# plt.title('Imported amount of the five main importing partners currently\n in Colombia from 2001 to 2020')
plt.title('Import quantity of the top five import partners currently\n in Colombia from 2001 to 2020')
plt.xlabel('Year', fontsize = 16)
plt.ylabel('USD Thousands of Millions', fontsize = 16)
plt.xticks(rotation = 30)
plt.show()

The three previous graphs allow us to measure the importance of countries such as the United States and China for Colombian imports, since in all the periods these two represented a great magnitude of the total.

Now below you can see a heatmap that adds another 5 more countries to the analysis to measure the comparison of the imported magnitude of each country from 2001 to 2020

data_ten_imp = imports.drop('World').loc['United States of America':'Ecuador']

plt.figure(figsize = (15,7))
plt.title('HeatMap import quantity of the top ten import partners currently in Colombia')
heat_imp = sns.heatmap(data = data_ten_imp,cmap="YlGnBu")

It could be thought that this great difference between countries such as the United States and China is due to the large size of their economies, so we are going to try to generate an indicator that can show us a slightly more suitable measure to compare the countries. We are going to generate a variable that is the relationship between the amount imported in 2020 from that country over the 2020 population of that territory.

df_indicator = five_imp
df_indicator['population'] = [329770000,1414350000, 127792000, 212559409, 83155031]  # 2020 population of usa, china, mexico brazil germany
df_indicator['indicator'] = df_indicator['amount'] / df_indicator['population']
df_indicator  

	index	country	year	relation	amount	population	indicator
1	39	United States of America	2020	imports	10634174	329770000	0.032247
2	59	China	2020	imports	10398882	1414350000	0.007352
3	79	Mexico	2020	imports	2925827	127792000	0.022895
4	99	Brazil	2020	imports	2435489	212559409	0.011458
5	119	Germany	2020	imports	1643891	83155031	0.019769

plt.title('Indicator realtions amount imported / population')
heat_imp_ind = sns.heatmap(pd.pivot(df_indicator,index = 'country',values = 'indicator', columns = 'year'),annot = True)

Note that now there are changes in the ‘positions’ of the countries, the United States continues to present the highest ratio, however, due to its large population, China has the lowest indicator of these five countries

Now, about exports…#

We are going to repeat the same process of analysis and graphics but now with the exports

total_exp = panel[(panel['country'] == 'World') & (panel['relation'] =='exports')]
total_exp['year'] = pd.to_datetime(total['year'], format='%Y')

fig = plt.figure(figsize = (10,5))
plt.title('Below you can see the annual behavior of Colombian exports from 2001 to 2020.\
 There are\nalso some interesting maximums points during this time serie.', loc = 'left', y=0.98,fontsize=12)
plt.grid(axis = 'x')
fig.suptitle('Colombian Exports 2001 -2020', y=1.05, fontsize=16)
ax1 = fig.add_subplot(1,1,1)  # se crea un eje pero la mayoria de comandos funciona con normalidad de plt
ax1.fill_between(total_exp['year'], total_exp['amount'] /1000000,alpha=0.05,color="green")
ax1.plot_date(total_exp['year'],total_exp['amount'] /1000000, color = 'DarkGreen',marker = 'x', linestyle = '-.', label = 'Colombian Imports')    
ax1.set_ylabel('USD Thousands of Millions', fontsize = 15)
ax1.set_xlabel('Year' ,fontsize = 15)
ax1.axis(ymin=0, ymax=80, xmin = pd.Timestamp('2000-10-01') )

plt.axvline(pd.Timestamp('2008-01-01'), color='black', ymax = 0.48, linestyle ='--', label = 'maximums of the serie')
plt.axvline(pd.Timestamp('2012-01-01'), color='black', ymax = 0.75, linestyle ='--')
plt.axvline(pd.Timestamp('2018-01-01'), color='black', ymax = 0.52, linestyle ='--')
plt.xticks(rotation = 30)
plt.legend()
fig.show()

five_exp = panel[(panel['relation'] == 'exports') & (panel['year'] == 2020)].sort_values(by = 'amount', ascending = False).reset_index().loc[1:5]
five_exp

	index	country	year	relation	amount
1	39	United States of America	2020	exports	9442028
2	59	China	2020	exports	2670041
3	79	Panama	2020	exports	1510555
4	99	Ecuador	2020	exports	1473607
5	119	Brazil	2020	exports	1274000

data_five_exp = panel[((panel['country'] == 'United States of America') | (panel['country'] == 'China') | 
                     (panel['country'] == 'Mexico') | (panel['country'] == 'Brazil') | (panel['country'] == 'Germany'))]

data_five_exp = data_five_exp[data_five_exp['relation'] == 'exports']

data_five_exp['amount'] = data_five_exp['amount'] / 1000000

(data_five_exp.groupby(['country','year'])['amount'].sum().reset_index()
.pivot(index ='year', values ='amount', columns = 'country')
.plot(kind = 'bar', stacked = True, figsize = (17,6),\
      title = 'Exported amount of the five main Exporting partners currently\n in Colombia from 2001 to 2020', ylabel ='USD Thousands of Millions'))

<AxesSubplot:title={'center':'Exported amount of the five main Exporting partners currently\n in Colombia from 2001 to 2020'}, xlabel='year', ylabel='USD Thousands of Millions'>

plt.figure(figsize=(20,6))
sns.set(style='whitegrid', rc={"grid.linewidth": 0.1})  # Seaborn Style
sns.set_context("notebook", font_scale=1.25)            # More Seaborn Style
rcParams['axes.titlepad'] = 20                          # set a space between  the title and the figure
plt.xticks(rotation=45)
plt.ylabel('USD Thousands of Millions')
plt.title('Exported amount of the five main importing partners currently\n in Colombia from 2001 to 2020')
sns.barplot(data = data_five_exp, x = 'year', y ='amount', hue = 'country')

<AxesSubplot:title={'center':'Exported amount of the five main importing partners currently\n in Colombia from 2001 to 2020'}, xlabel='year', ylabel='amount'>

data = data_five_exp
data['year'] = pd.to_datetime(data['year'], format='%Y')

plt.figure(figsize = (15,7))

for i in list(data['country'].unique()):
  data_ = data[data['country'] == i]
  plt.plot(data_['year'],data_['amount'], label = i, marker = '.')
plt.grid(axis = 'x')
plt.xlim(left =pd.Timestamp('2000-10-01'))
plt.ylim(0,25)
plt.legend()
plt.title('Exported amount of the five main importing partners currently\n in Colombia from 2001 to 2020')
plt.xlabel('Year', fontsize = 16)
plt.ylabel('USD Thousands of Millions', fontsize = 16)
plt.xticks(rotation = 30)
plt.show()

data_ten_exp = exports.drop('World').loc['United States of America':'India']

plt.figure(figsize = (15,7))
sns.heatmap(data = data_ten_exp,cmap="YlGnBu")

<AxesSubplot:ylabel='country'>

df_indicator = five_exp
df_indicator['population'] = [329770000,1414350000,4279000, 17511000, 212559409]  # 2020 population of usa, china, panama, ecuador brazil 
df_indicator['indicator'] = df_indicator['amount'] / df_indicator['population']
df_indicator  

	index	country	year	relation	amount	population	indicator
1	39	United States of America	2020	exports	9442028	329770000	0.028632
2	59	China	2020	exports	2670041	1414350000	0.001888
3	79	Panama	2020	exports	1510555	4279000	0.353016
4	99	Ecuador	2020	exports	1473607	17511000	0.084153
5	119	Brazil	2020	exports	1274000	212559409	0.005994

plt.title('Indicator realtions amount exported / population')
heat_exp_ind = sns.heatmap(pd.pivot(df_indicator,index = 'country',values = 'indicator', columns = 'year'),annot = True)

In the case of exports, the strong dependence on the United States as a trading partner was much more noticeable, if China and the United States made up the majority of the total in imports, now only the American country represents a large magnitude of the total.

However, note that when we generate our ratio indicator between amount and population for the year 2020, Panama manages to lead with the highest magnitude (it must be taken into account that this country is significantly smaller than the others).

Now let’s take a quick look at the balance of trade#

Below are graphs and tables on the behavior of the country’s trade balance

import seaborn as sns

test = panel[panel['country'] == 'World']

plt.figure(figsize=(20,6))
sns.set(style='whitegrid', rc={"grid.linewidth": 0.1})  # Seaborn Style
sns.set_context("notebook", font_scale=1.25)            # More Seaborn Style
rcParams['axes.titlepad'] = 20                          # set a space between  the title and the figure
plt.xticks(rotation=45)
# plt.ylim(0,1000000)
plt.ylabel('fa')
balance1 = sns.barplot(data = test, x = 'year', y ='amount', hue = 'relation').set_title('Imports and Exports Colombia. 2001 -2020. USD Thousands of Millions')

test = panel[panel['country'] == 'World']

exp_w = test[test['relation'] == 'exports']
exp_w['year'] = pd.to_datetime(exp_w['year'], format='%Y')

imp_w = test[test['relation'] == 'imports']
imp_w['year'] = pd.to_datetime(imp_w['year'], format='%Y')

plt.figure(figsize = (10,5))
plt.title('Below you can see the annual behavior of Colombian exports and imports from 2001 to 2020.\
 \nThere are also some interesting apparent balance points during this time series.', loc = 'left', y=0.98,fontsize=12)
plt.suptitle('Balance of trade of Colombia 2001 -2020', y=1.05, fontsize=16)
plt.plot(exp_w['year'],exp_w['amount']/1000000, label = 'Exports', c = 'red', linestyle = '-.')
plt.plot(imp_w['year'],imp_w['amount']/1000000, label = 'Imports', c = 'blue', linestyle = '-.')
plt.fill_between(imp_w['year'],exp_w['amount']/1000000,imp_w['amount']/1000000,
                 alpha = 0.4,interpolate = True, label = 'Trade Balance')
plt.grid(axis ='x')
plt.xlabel('Year', fontsize = 15)
plt.ylabel('USD Thousands of Millions',fontsize = 15) 
plt.xlim(left =pd.Timestamp('2000-10-01'))
plt.xticks(rotation = 0)
plt.legend()
plt.show()

balance = pd.concat([exp_w[['year','amount']],imp_w['amount']], axis = 1)
balance.columns = ['year','exp','imp']
balance['year2'] = [i for i in range(2001,2021)]
balance['trade_balance'] = balance['exp'] - balance['imp']
balance['trade_balance'] = balance['trade_balance']/1000000

balance.head(5)

	year	exp	imp	year2	trade_balance
0	2001-01-01	12301486	12820352	2001	-0.518866
1	2002-01-01	11897488	12689965	2002	-0.792477
2	2003-01-01	13092218	13880613	2003	-0.788395
3	2004-01-01	16729678	17099537	2004	-0.369859
4	2005-01-01	21190439	21204162	2005	-0.013723

plt.figure(figsize = (10,5))
plt.xticks(rotation = 45)
trade = sns.barplot(x ='year2' ,data = balance, y = 'trade_balance').set_title('Trade Balance of Colombia 2001-2020 USD Thousands of Millions')

Reflections#

A quick review of the country’s import and export data during the 21st century shows that there is a certain dependence on countries such as the United States and China, both in terms of exports and imports.

On the other hand, it can be noted that Colombia has been in deficit in the trade balance for most of the century. A future analysis of both import and export products and services would allow us to explore more thoroughly than this what happens with these balances within the territory.

Economics, Datascience And Programing

Commercial Relations Of Colombia In The Last 20 Years

Contents