Working with geospatial data on AWS Ubuntu

I’ve stumbled on different sorts of problems while working with geospatial data on cloud machines. AWS EC2 and Ubuntu sometimes require different setups. This is a quick note on installing GDAL on Ubuntu and on transferring data from your local machine to your cloud machine without using S3.

To install GDAL

# Add the UbuntuGIS PPA, which carries newer GDAL builds than the default repositories
sudo add-apt-repository -y ppa:ubuntugis/ubuntugis-unstable
sudo apt update
sudo apt upgrade # if you already have gdal 1.11 installed
sudo apt install gdal-bin python-gdal python3-gdal # if you don't have gdal 1.11 already installed
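Once the packages are installed, a quick way to confirm that Python can see GDAL is to try importing the bindings. The snippet below is a minimal sketch, assuming the python3-gdal package exposes the usual `osgeo` module; the helper name `gdal_python_version` is made up for this note.

```python
import importlib.util


def gdal_python_version():
    """Return the GDAL release string seen by Python, or None if the bindings are missing."""
    # Assumption: the bindings from python3-gdal are importable as the `osgeo` package.
    if importlib.util.find_spec("osgeo") is None:
        return None
    from osgeo import gdal
    return gdal.VersionInfo("RELEASE_NAME")


if __name__ == "__main__":
    version = gdal_python_version()
    print("GDAL bindings not found" if version is None else f"GDAL {version}")
```

If this reports that the bindings are missing, the install step above probably did not complete for the Python version you are running.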

To transfer data (SFTP) from your local machine to AWS EC2, you could use FileZilla.

Another option is to use S3 with Cyberduck.

To set up the environment, please refer to this post and this video.

How to use the online map tool for investing in sustainable rubber cultivation in tropical Asia

Please go ahead and play with the full-screen map here.

This map application was developed to support the Guidelines for Sustainable Development of Natural Rubber, an effort led by the China Chamber of Commerce of Metals, Minerals & Chemicals Importers & Exporters with support from the World Agroforestry Centre, East and Central Asia Office (ICRAF). Asia produces >90% of global natural rubber, primarily in monoculture for the highest yield in limited growing areas. Rubber is largely harvested by smallholders in remote, undeveloped areas with limited access to markets, which imposes substantial labor and opportunity costs. Typically, rubber plantations are introduced in high-productivity areas, are then pushed onto marginal lands by industrial crops and other land uses, and become only marginally profitable for various reasons.


Fig. 1. Rubber plantations in tropical Asia. Rubber brings good fortune to millions of smallholder rubber farmers, but it also causes ecological and environmental damage.

The online map tool was developed for smallholder rubber farmers, foreign and domestic natural rubber investors, and different levels of government.

The online map tool is entitled “Sustainable and Responsible Rubber Cultivation and Investment in Asia”, and it includes two main sections: “Rubber Profits and Biodiversity Conservation” and “Risks, SocioEconomic Factors, and Historical Rubber Price”.

The main user interface is shown in Fig. 2. There are 4 theme graphs and maps.


Fig. 2. The main user interface of the online map tool.

Section 1

This graph shows the relationship between “Minimum Profitable Rubber (USD/kg)” (the x-axis of the graph) and “Biodiversity (total species number)” (the y-axis) in the 2736 counties that planted natural rubber trees in eight countries in tropical Asia. There are 4312 counties in total; in this map tool, we only present the counties where natural rubber is cultivated.


Fig. 3. How to read and use the data from the first graph. Each dot/circle represents a county; its color and size indicate the area planted with natural rubber. When you move your mouse over a dot, you will see, for example, “(2.34, 552) 400000 ha @ Xishuangbanna, China”: 2.34 is the minimum profitable rubber price (USD/kg), and 552 is the total number of wildlife species, including amphibians, reptiles, mammals, and birds. “400000 ha” is the total area of natural rubber plantation mapped from satellite images between 2010 and 2013. “@ Xishuangbanna, China” is the geolocation of the county.

Don’t be shy, please go ahead and play with the full-screen map here. The minimum profitable rubber price is the market price for national standard dry rubber products at which you start to make a profit. For example, if the market price of natural rubber is 2.0 USD/kg in the county where your rubber plantation is located but your minimum profitable rubber price is 2.5 USD/kg, you will lose money (about 0.5 USD/kg) by just producing rubber products. However, if your minimum profitable rubber price is 1.5 USD/kg, you will still make about 0.5 USD/kg profit from your plantation.
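The arithmetic in that example can be written out in a few lines. This is only an illustrative sketch: the function name `profit_per_kg` is made up here, and the numbers are the ones from the example above, not values read from the map tool.

```python
def profit_per_kg(market_price, min_profitable_price):
    """Profit (USD/kg) left once the minimum profitable price is covered."""
    return market_price - min_profitable_price


market_price = 2.0  # USD/kg, the market price used in the example above
for min_price in (2.5, 1.5):
    margin = profit_per_kg(market_price, min_price)
    if margin > 0:
        outcome = "profit"
    elif margin < 0:
        outcome = "loss"
    else:
        outcome = "break-even"
    print(f"minimum profitable price {min_price:.1f} USD/kg: "
          f"{outcome} of {abs(margin):.1f} USD/kg")
```

A negative result means the market price does not cover the minimum profitable price in that county.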

A county with a lower minimum profitable price for natural rubber is generally going to make a better rubber profit in the global natural rubber market. However, as the scientists behind this research, we hope that before you rush to invest and plant rubber in a certain county, you will also think about other risks, e.g. biodiversity loss and topographic, tropical storm, frost, and drought risks. These are shown later in this demonstration.


Fig. 4. The first map is the “Rubber Cultivation Area”, which shows each county that has rubber trees, colored from yellow (low) to red (high). The second map is the “Minimum Profitable Rubber Price” (USD/kg); again, the higher the minimum profitable price, the lower the rubber profits that farmers and investors will receive. The third map is “Biodiversity (Amphibians, Reptiles, Mammals, and Birds)”; the data were aggregated from the IUCN Red List and BirdLife International.

Section 2

We also demonstrate different types of risks that investors and smallholder farmers would face when they invest in and plant rubber trees. A rubber tree doesn’t produce latex before it is 7 years old, and in general the tree owners won’t make any profit until the tree is around 10 years old. In this section, we present “Topographic Risk”, “Tropical Storm”, “Drought Risk”, and “Frost Risk”.


Fig. 5. Section 2, “Risks, SocioEconomic Factors and Historical Rubber Price”, has seven different theme maps and interactive graphs. They are “Topographic Risk”, “Tropical Storm”, “Drought Risk”, “Frost Risk”, “Average Natural Rubber Yield (kg/ha.year)”, “Minimum Wage for the 8 Countries (USD/day)”, and “10 years Rubber price”.

If you are interested in how the risk theme maps were produced, Dr. Antje Ahrends and her coauthors published a peer-reviewed article in Global Environmental Change in 2015. The “Average Natural Rubber Yield (kg/ha.year)” and “Minimum Wage for the 8 Countries (USD/day)” datasets were obtained from the International Labour Organization (ILO, 2014) and FAO. “10 years Rubber price” was scraped from IndexMundi Natural Rubber Price.

Dr. Chuck Cannon and I are wrapping up a peer-reviewed journal article to explain the data collection, analysis, and policy recommendations based on the results, and we will share the link to the article once it’s available. Dr. Xu Jianchu and Su Yufang have provided guidance that shaped the development of the online map tool. We could not have gathered the datasets and drawn insights on how to cultivate, manage, and invest in natural rubber responsibly without the other scientists and researchers who have studied and contributed to the field for years. We appreciate the Wildlife Conservation Society, many other NGOs, and the national departments of rubber research in Thailand and Cambodia for their support during our field investigation in 2015 and 2016.

We have two country reports, one on natural rubber in Thailand and one on natural rubber and land conflict in Cambodia. A report supporting this online map tool is being finalized, and we will share the link as soon as it’s ready.


Technical side

The research and analysis were done in R, and you can find my code here.

The visualization is purely coded in R too. Isn’t R such an awesome language? You can see my code for the visualization here.

To render a multi-polygon GeoJSON file quickly, you should simplify it first (ms_simplify comes from the rmapshaper package):

county_json_simplified <- ms_simplify(<your geojson file>)

My original GeoJSON for 4000+ counties weighed about 100 MB, but this code helped reduce it to about 5 MB, and it renders much faster on
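To give a feel for why simplification shrinks a file so much, here is a small, self-contained sketch of line simplification in Python. It uses the Douglas-Peucker algorithm, which is a swapped-in technique for illustration and may differ from the method `ms_simplify` (assumed here to be the rmapshaper function) applies; the function names and the sample coordinates are made up.

```python
import math


def _point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: fall back to point-to-point distance.
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify_line(points, tolerance):
    """Drop vertices that lie within `tolerance` of the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        # Every intermediate vertex is close enough: keep only the endpoints.
        return [first, last]
    # Otherwise keep the farthest vertex and simplify both halves.
    left = simplify_line(points[: index + 1], tolerance)
    right = simplify_line(points[index:], tolerance)
    return left[:-1] + right


# Made-up boundary fragment: the nearly collinear vertex (1, 0.01) is removed.
boundary = [(0, 0), (1, 0.01), (2, 0), (3, 2), (4, 0)]
print(simplify_line(boundary, tolerance=0.1))
```

County boundaries digitized at high resolution carry many such nearly collinear vertices, which is where most of the size reduction comes from.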

I learnt a lot from this blog on manipulating GeoJSON with R and from another blog on using flexdashboard in R for visualization. Having open source tools and general support from R users is great.

location, location, more locations: location intelligence and geocoding for growing the business

I’ve been browsing through the job boards too much recently. Yes, TOO MUCH, which makes me so anxious and angry sometimes. The employers out there just wanna have you do everything. Take the GIS kinda jobs, the only ones I am really interested in right now: the employers want me to have used ArcGIS for years, know all the spatial analysis/statistics, and also know open source data sources and different kinds of satellite image processing. OK, I could do that. But I also need to be able to code in C++, Python, Java, CSS and HTML, AND knowing the popular statistical and mathematical tools, like SAS, STAT, R, or MATLAB, is a plus. What about also using Adobe Illustrator to make the most awesome maps? And you’d better speak a second and a third foreign language. The essential duties for the job position are …. a list of 20 duties, and requirements… another 30 of them, and additionally… you must have xxx years of social, economic and environmental science related work experience in Africa, Asia and South America…. I get it, I am never gonna be a good candidate. But, employers out there, come on… you don’t need a technical slave or a Mr./Ms. Knows-Everything; you need an employee who can learn and wants to learn, who can really evolve with your business, and who is passionate about the job you give them. What terrifies me the most is that when employers refuse to give you a job offer, they are also so confident that they are gonna find someone who fits the position very soon. YOU ARE NOT THE BEST: that is the message I get every day while I’m browsing through the job boards!

Back to location intelligence. Business people and enterprise and industry leaders out there have grown more and more interested in analyzing your shopping behavior, habits and locations. Yes, it’s all about us. The U.S. Census Bureau has launched two programs on location analysis/intelligence for small business people who wanna start their own business: one is called County Business Patterns and the other is ZIP Code Business Patterns. Both aim to help small business people. The data run from 1998 to 2013. I’ve never had a chance to use them, but it could be super cool to dig the information and patterns out of the data. Future business starters will need more and more of this kinda information. In my own opinion: firstly, the business patterns would help you analyze or map out businesses similar to the one you wanna run in your town, your county or even nationally; secondly, the ZIP code business patterns could do a similar kind of analysis, but the ZIP code could also be used to analyze your potential customers’ behavior, race and so on, which means roughly mapping out your potential customers; the last step could be the real location analysis/intelligence, which would help you work out the best location to build/start your business, avoiding potential competitors while targeting a bigger group of future customers. It’s certainly a mix of information science, spatial analysis, statistics….

I only know about ZIP codes/geocoding so far, but it’s way too cool, and I just wanna make a little note to myself in this blog. For an example, you could go back to see my first post on this blog site. The main process for matching addresses/ZIP codes is: 1) build/obtain reference data, which could be points, e.g. cities, counties, nations, or houses; polylines, e.g. streets and roads; or polygons, e.g. independent houses and business centers; 2) select an address locator style: US Address-Dual Ranges, One Range, Single House, Street Name, City States, ZIP 5 Digit, ZIP +4, General-City State Country, General-Gazetteer, or General-Field; 3) build the address locator; and then 4) perform address matching. ArcGIS geocoding can handle this process for you, so you can just run the geocoding through it. In spatial analysis, besides the locations, the scale you are interested in is very important: for example, an independent house and a shopping mall are polygons at a larger spatial scale, but they become points when you zoom out to a smaller scale.
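The four steps above can be sketched in plain Python to show the idea, roughly in the spirit of a “ZIP 5 Digit” style locator. This is not how ArcGIS implements geocoding; the reference table, its coordinates, the addresses, and the function name `match_address` are all made up for illustration.

```python
import re

# Step 1: reference data. Hypothetical ZIP codes mapped to placeholder (lat, lon) points.
REFERENCE = {
    "12345": (10.0, 20.0),
    "67890": (30.0, 40.0),
}

# Steps 2-3: the "locator" here is just a rule for pulling a 5-digit ZIP
# (optionally followed by a +4 extension) out of a free-text address.
ZIP_PATTERN = re.compile(r"\b(\d{5})(?:-\d{4})?\b")


def match_address(address, reference=REFERENCE):
    """Step 4: return the reference point for the address's ZIP code, or None."""
    candidates = ZIP_PATTERN.findall(address)
    if not candidates:
        return None
    # Use the last 5-digit group, since a house number may also have 5 digits.
    return reference.get(candidates[-1])


for address in ("1 Example Road, Sampletown, ST 12345",
                "99999 Example Ave, Sampletown, ST 67890-1234",
                "somewhere without a ZIP code"):
    print(address, "->", match_address(address))
```

Matching on the ZIP code alone only places an address at the point chosen for that ZIP area, which ties back to the note about scale: the same building is a polygon at one scale and a point at another.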


Creating interactive maps inside existing business systems can help users see patterns that graphs and charts cannot reveal. (ESRI)

References for the blog content (apart from the complaining at the beginning and my own thoughts):

  1. Location analysis for business from ESRI;
  2. Geocoding on Wiki;
  3. Business pattern analysis data from U.S. Census;
  4. Business strength geocoding.