Areas of Interest driven by Cloud Raster Format and webtools in the ESRI portal
Context
- Basecamp thread with videos.
- Rapid summaries report: includes some tricks for building models in Model Builder, plus a performance test.
- CRF creation and storage: the process to build a crf is in this document.
- Creating a model with Model Builder: at the bottom of this document a description of creating a geoprocessing service with Model Builder.
IMPORTANT NOTE: This documentation was written using ArcGIS Pro version 2.8.3 and ArcGIS Enterprise Portal version 10.9 (May 2021 release).
Creating a crf
Check the document on CRF creation and storage for detailed steps. Before starting, make sure you are NOT building pyramids; this can add hours to the processing. Your Pro may be set to do this automatically. You can turn it off in Options (accessed by clicking on the Project tab in the top left corner). If you don't want to make the change through the Options, you can simply make sure that you untick the `Build pyramids` option when running a geoprocessing service.
In summary, the geoprocessing steps followed are:
- Create mosaic dataset (creates a container inside the geodatabase to hold the rasters)
- Add rasters to mosaic dataset (adds rasters to the container)
- Calculate Field (creates a unique id field)
- Build multidimensional info (adds Dimension and Variable information)
- Table to table (optional; creates a lookup table to match the unique id and the raster name)
- Copy raster (creates the crf with all the previous information; this is the long step)
(This notebook contains the sequence of geoprocessing commands in Python.)
Things to be aware of:
- The projection of the rasters and the crf should match. Set the projection in the `Environments` tab of the first step, `Create mosaic dataset`. (You can use one of the rasters as input for the Coordinate System; by default ArcGIS Pro uses Pseudo Mercator 3857.)
- This is a multidimensional processing operation. Set the `Parallel processing` option to 90% in the `Environments` tab every time.
- The most critical step is `Build multidimensional info`, because it has to be clear what the `variables` and the `dimensions` are. After creating a new field in the attribute table of the Mosaic dataset, use the following input as a guide. This example shows how the encroachment datacube was created.
1. The geoprocessing parameters
IMPORTANT NOTE: Be aware that when you check the completed geoprocess in the History, the information in `Variable Field` automatically changes to the new `Variable` field. Always check the Python snippet to see what has been done.

```python
arcpy.md.BuildMultidimensionalInfo("land_encroachment", "Variable_new", "SliceNumber # #", "ghm # #")
```
2. The multidimensional info properties of the Mosaic dataset should then look like this:
3. And finally, the changes in the attribute table that show the multidimensional information:
Where do the CRFs live?
They live in an Azure bucket in the cloud. They are managed using Microsoft Azure Storage Explorer. There are different ways to access the CRFs.
Accessing the CRFs from ArcGIS Pro
The virtual machine already has the `.acs` file that makes the connection. The `yaleCube.acs` file is located in `Documents/ArcGIS/Projects`. When a new project is created, a connection is made via the ribbon (Insert > Connections > Add Cloud Storage Connection). Then `yaleCube.acs` appears in the Catalog, within the Cloud Stores folder, and we can add the CRFs stored there to our map and work with them in ArcGIS Pro.
Accessing the CRFs from the webtools
Once the webtools are created (the process is explained in the following sections), we can test them and use them, calling the CRFs directly from the Azure bucket in the cloud. For that, we need to use the path `/cloudStores/HECloudstore_ds_vwkuvgmvcfqewwft/` and add the name of the CRF we need to use.
Currently (March 2023), the services in use require these CRFs:
- For the biodiversity GP services:
CRF | Path |
---|---|
Birds | /cloudStores/HECloudstore_ds_vwkuvgmvcfqewwft/birds_equal_area_20211003.crf |
Amphibians | /cloudStores/HECloudstore_ds_vwkuvgmvcfqewwft/amphibians_equal_area_20211003.crf |
Mammals | /cloudStores/HECloudstore_ds_vwkuvgmvcfqewwft/mammals_equal_area_20211003.crf |
Reptiles | /cloudStores/HECloudstore_ds_vwkuvgmvcfqewwft/reptiles_equal_area_20211003.crf |
- For the contextual GP services:
CRF | Path |
---|---|
Ecological Land Units (ELU) | /cloudStores/HECloudstore_ds_vwkuvgmvcfqewwft/ELU.crf |
Population | /cloudStores/HECloudstore_ds_vwkuvgmvcfqewwft/population2020.crf |
WDPA | /cloudStores/HECloudstore_ds_vwkuvgmvcfqewwft/WDPA_Terrestrial_CEA_June2021.crf |
Human Pressure: energy and extractive resources | /cloudStores/HECloudstore_ds_vwkuvgmvcfqewwft/Extraction_TimeSeries_Reclassify_20230501.crf |
Human Pressure: Transportation | /cloudStores/HECloudstore_ds_vwkuvgmvcfqewwft/Transportation_TimeSeries_Reclassify_20230515.crf |
Human Pressure: Agriculture | /cloudStores/HECloudstore_ds_vwkuvgmvcfqewwft/Agriculture_TimeSeries_Reclassify_20230501.crf |
Human Pressure: Human intrusion | /cloudStores/HECloudstore_ds_vwkuvgmvcfqewwft/HumanIntrusion_TimeSeries_Reclassify_20230501.crf |
Human Pressure: Urban and built-up | /cloudStores/HECloudstore_ds_vwkuvgmvcfqewwft/Builtup_TimeSeries_Reclassify_20230501.crf |
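The webtool parameters expect these full cloud-store paths rather than bare CRF names. As a minimal sketch (plain Python; the helper name `crf_path` is hypothetical, introduced here only for illustration), a test script might build them like this:

```python
# Root of the Azure cloud store as registered in the Portal (from the tables above)
CLOUD_STORE = "/cloudStores/HECloudstore_ds_vwkuvgmvcfqewwft"

def crf_path(crf_name):
    """Build the full cloud-store path for a CRF file name."""
    return f"{CLOUD_STORE}/{crf_name}"

birds_crf = crf_path("birds_equal_area_20211003.crf")
```

This avoids hand-typing the long store identifier for every CRF when filling in test requests.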
Building and publishing a Geoprocessing Service or webtool
A webtool is a geoprocessing tool that lives in the ArcGIS Enterprise Portal and can be called from the Half-Earth application to make calculations on the fly. The data shown when a user draws their own area of interest relies completely on these tools.
To create these tools, the first step is to build a model locally in ArcGIS Pro and then publish it to the Portal.
Create a Model in ArcGIS Pro
In the Catalogue, create a new model (Model Builder) in the project's Toolbox.
Once the model is ready, indicate which ovals are parameters by right-clicking on them (a small `P` appears). Set their names so they are legible by the front end: inside Model Builder, rename the parameters (right-click on the ovals). Currently we are using: `geometry`, `crf_name` and `output_table`.
Use the `Calculate Value` tool as much as possible (this is a Model Builder Utilities tool); Python is quicker than adding extra geoprocessing tools.
Avoid using the `Calculate Field` tool, because it generates problems when publishing due to version incompatibilities between ArcGIS Pro and the ArcGIS Enterprise Portal.
Since we are using multidimensional cubes, it is key to set `Parallel processing` to `90%` in every geoprocessing tool added to the model. That will speed up the processing.
IMPORTANT NOTE: Take into account the input and output coordinate reference systems. In this case, we are providing a Pseudo Mercator input (the geometry) and we are using it against an Equal Area projection (the crf). The geoprocessing tools automatically serve the output in the raster projection, without us having to manually add a re-projection step to change the CRS of the geometry. However, it is good practice to make the CRS explicit in the `Environments` tab of each of the geoprocessing tools.
Publish a geoprocessing service
Once our model (made with Model Builder) is ready, we can publish it as a geoprocessing service. For that, follow these steps:
- Create a small polygon using the Sample tool (click on the pencil that shows up on the left of `Input location raster or features` and create a small polygon in an area of interest). In the Table of Contents, right-click on the new polygon and click on `Zoom to Layer` to show only the polygon.
- Create a small subset of the crf using `Subset Multidimensional Raster`, setting the environment setting of `extent` to "Current Display". The reason for doing this is to avoid copying the entire datasets to the Portal. Since the model cannot be published without layers, we generate samples of the crf that are limited to the extent of the geometry that we are using for testing. The smaller the extent the better, but we need to make sure the crf samples are slightly larger than the polygon used for testing, otherwise the tool will fail.
- Run the model as a geoprocessing tool, using the polygon and the subset crf (by clicking on the model inside the Toolbox directly). Select `90%` for parallel processing.
- From the History, right-click on the model that has just been run and choose `Share as a Web tool` (make sure you are logged into the Production Portal, look at the top right, otherwise the option won't appear).
- In the General panel:
  - Name the model as `ModelNameProd`
  - Add the model to the Production folder
  - Click on `Copy all data`
  - Click on `Share with Everyone`
- In the Configuration panel:
  - Change the `Message level` to `Info` (this will give more details in case of an error).
  - Increase the `Maximum number of records returned by server` to 100000. This is very important to avoid not returning a response to the front end.
- In the Content panel, configure the tool properties (click on the pencil on the right):
  - Set the geometry to `User defined value`.
  - Set the crf as `Choice list`, making sure only the subset crf is selected, by clicking on `Only use default layers`. This is so only the minimum amount of data is copied, but also so there aren't several elements in the choice list.
  - Add the description to the different parameters.
- Untick the option `Add optional output Feature Service Parameter` (we are not using this).
- Analyse before publishing to check which parameters or info are missing from the description of the tool. Sometimes Analyse has to be run a couple of times without changing anything between analyses. There will always be a warning message saying that the data will be copied to the Portal; that is expected and OK.
- Click on `Publish`!
Find the published GP Service on the Portal
Once you have successfully published a webtool, it appears in the `Portal` section of the Catalogue panel. When hovering over the tool, the URL to the item in the Portal appears. Follow the URL and it takes you to the Portal on the web (you can also log in directly to the Portal with your credentials, go to Content and find the tool you want to check). The "look and feel" of the Portal is identical to ArcGIS Online. Protect the tool from deletion and set the sharing options to Public in the settings. Then, in the Overview panel, at the bottom right, you will find the URL of the service. Click to view the tool in a new window.
In this new window, click on `Tasks`. The URL that appears in the search bar is the one that the front end must use. The URL should look like this: `https://heportal.esri.com/server/rest/services/<Tool name>/GPServer/<task name>`.
Another way to get the URL is to click on the tool and, in the Overview panel, under Tools, click on `Service URL`; this will take you directly to the Tasks view, with the URL at the top.
Test the new GP Service
In order to check that the GP service is working correctly before passing the URL to the front-end, we can simulate the call that the front-end would make in the ArcGIS REST API.
Access the tool
- Log in to the Production Portal (https://heportal.esri.com/portal/home) with the required credentials
- Go to Content > Production folder > choose a GP service
- Click on Service URL
- Go to the bottom of the page and click on `Submit Job`
Fill in the parameters
To test the service, we need to provide a `geometry` and the path to the CRFs used by the webtool.
- Geometry: the geometry needs to have a very specific format. The structure passed is a JSON that can be obtained using the `Features To JSON` tool in ArcGIS Pro: make sure the output path is set outside of the gdb so it can be accessed easily, and tick the boxes for `Formatted JSON` and `Include Z values`.
This is an example of the geometry you need to add to the box:
```json
{
  "displayFieldName" : "",
  "hasZ" : true,
  "fieldAliases" : {
    "OBJECTID" : "OBJECTID",
    "Name" : "Name",
    "Text" : "Text",
    "IntegerValue" : "Integer Value",
    "DoubleValue" : "Double Value",
    "DateTime" : "Date Time",
    "Shape_Length" : "Shape_Length",
    "Shape_Area" : "Shape_Area"
  },
  "geometryType" : "esriGeometryPolygon",
  "spatialReference" : {
    "wkid" : 102100,
    "latestWkid" : 3857
  },
  "fields" : [
    {"name" : "OBJECTID", "type" : "esriFieldTypeOID", "alias" : "OBJECTID"},
    {"name" : "Name", "type" : "esriFieldTypeString", "alias" : "Name", "length" : 255},
    {"name" : "Text", "type" : "esriFieldTypeString", "alias" : "Text", "length" : 255},
    {"name" : "IntegerValue", "type" : "esriFieldTypeInteger", "alias" : "Integer Value"},
    {"name" : "DoubleValue", "type" : "esriFieldTypeDouble", "alias" : "Double Value"},
    {"name" : "DateTime", "type" : "esriFieldTypeDate", "alias" : "Date Time", "length" : 8},
    {"name" : "Shape_Length", "type" : "esriFieldTypeDouble", "alias" : "Shape_Length"},
    {"name" : "Shape_Area", "type" : "esriFieldTypeDouble", "alias" : "Shape_Area"}
  ],
  "features" : [
    {
      "attributes" : {
        "OBJECTID" : 1,
        "Name" : null,
        "Text" : null,
        "IntegerValue" : null,
        "DoubleValue" : null,
        "DateTime" : null,
        "Shape_Length" : 231978.71016606738,
        "Shape_Area" : 3338690868.7937865
      },
      "geometry" : {
        "hasZ" : true,
        "rings" : [
          [
            [-818493.72899999842, 5383774.996100001, 0],
            [-755549.04740000144, 5382225.2217999995, 0],
            [-756854.20630000159, 5329215.6889000013, 0],
            [-819798.88789999858, 5330765.4632999972, 0],
            [-818493.72899999842, 5383774.996100001, 0]
          ]
        ]
      }
    }
  ]
}
```
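Before pasting a geometry into the box, it can save time to sanity-check the payload locally. A minimal sketch (plain Python standard library only; the function name `polygon_rings_closed` is hypothetical) that verifies the payload parses and that every ring is closed, i.e. the first vertex equals the last, as in the example above:

```python
import json

def polygon_rings_closed(payload_text):
    """Return True if every ring in an Esri polygon JSON payload is closed
    (first vertex equals last vertex)."""
    data = json.loads(payload_text)  # raises ValueError on malformed JSON
    if data.get("geometryType") != "esriGeometryPolygon":
        raise ValueError("not a polygon payload")
    return all(
        ring[0] == ring[-1]
        for feature in data["features"]
        for ring in feature["geometry"]["rings"]
    )
```

An unclosed ring is a common reason for a GP job to fail silently on the geometry parameter.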
- CRFs:
Depending on the GP service, we will need to provide the path to one or more CRFs. For example, the biodiversity GP services only use one CRF, the one corresponding to their taxon. The contextual GP service, on the other hand, extracts information from different CRFs, so we need to provide the paths to all of them.
Note that the default values that appear in the boxes represent the data that was used to publish the service, that is, small subsets of the original CRFs. So if we test the tool using a different geometry but do not change the default CRF paths, the new geometry will fall outside the extent of the subset data and we will get an error. For that reason, we need to make sure we substitute the default names with the complete path to the Azure bucket, `/cloudStores/HECloudstore_ds_vwkuvgmvcfqewwft`, plus the name of the corresponding CRF.
Submit job
Once the boxes have the geometry in JSON format and the paths to the required CRFs, we can click on `Submit Job (POST)`. To see how the process is going and check whether there are any errors, we can click on `Check Job Details` and get updates on the progress.
When the process finishes, we get links to the output tables, which provide the results in JSON format.
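The same call can be scripted instead of using the web form. The sketch below (plain Python standard library; the `submit_job_url` helper is hypothetical) builds a `submitJob` request URL following the standard ArcGIS REST pattern. Note that in practice the form sends these values as a POST body; the query-string form shown here is just convenient for quick manual tests:

```python
from urllib.parse import urlencode

def submit_job_url(task_url, geometry_json, crf_path):
    """Build a submitJob request URL for a GP task.

    Parameter names (geometry, crf_name) mirror the model parameters
    described earlier; f=json asks the server for a JSON response.
    """
    params = {"geometry": geometry_json, "crf_name": crf_path, "f": "json"}
    return f"{task_url}/submitJob?{urlencode(params)}"
```

The response to this request contains a job id, which is then polled for status and results.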
Current GP services in use
The section AOI_summaries provides information about the most recent Geoprocessing Services. They were created in March 2023 for the implementation of the AOI richer summaries, which included new calculations such as the SPS and the incorporation of new human pressure layers.
AOI history and its maintenance
The AOIs created by users can be shared with a URL. When the URLs are created, the data is also sent to an AGOL table where it is stored. When the recipient uses the URL, the same data will be displayed without having to call the GP service. Currently, this is the service being tested, but there are previous versions in the folder #2 aois (aoi-historic and aoi-historic-dev).
Cleaning the historic AOIs service
The notebook saved in the organisation is ready to be activated to start the cleaning on the first of every month. A version for reference can be found in the he-scratchfolder. The important variable to check is the limit on the number of features that the service should have: `feature_limit`.
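The core of that cleanup logic can be sketched without arcpy (the function and field names below, `features_to_delete`, `OBJECTID`, `CreationDate`, are hypothetical stand-ins; adjust them to the actual AGOL table schema and the notebook's `feature_limit`):

```python
def features_to_delete(features, feature_limit):
    """Return the OBJECTIDs of the oldest features beyond feature_limit.

    `features` is a list of dicts with 'OBJECTID' and a sortable
    'CreationDate' value; the newest `feature_limit` rows are kept.
    """
    if len(features) <= feature_limit:
        return []  # under the limit, nothing to clean
    ordered = sorted(features, key=lambda f: f["CreationDate"])
    return [f["OBJECTID"] for f in ordered[: len(features) - feature_limit]]
```

Deleting by a precomputed id list keeps the actual delete call (e.g. via the ArcGIS API for Python) simple and auditable.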
Information from the first iteration of geoprocessing services
Just in case some of this information is needed in the future, we have kept the documentation written when the first GP services were created (2021):
Some details about the tools used within the GP Services when working with CRFs
NOTE: This documentation was written using ArcGIS Pro version 2.6.4 and Portal version 10.8.1, so there might be some differences with newer versions.
- `Sample`: This is a super powerful tool. Its power lies in the fact that it was the first tool developed to deal with multidimensional data. Our testing showed that the processing time increases as the area of interest increases. To use it in the Portal server, it was necessary to add a new integer field to the polygon.
- `Zonal Statistics as Table`: Our testing showed that the processing time increased as the number of slices increased, not as the area of interest did.
- `Polygon to Raster`: When rasterizing a polygon for the purpose of calculating proportions, it is key that the cell size is the same as that of the input crf. In this case, opposite to `Sample`, the field that worked well was `OBJECTID`. In the ArcGIS Pro version we were working with, the raster created did not have an attribute table. Within Model Builder we got the number of pixels using `Calculate Value`, the call `getAreaRaster(r"%custom_raster%")`, and the following Python code block (it is key to use the right double quotes):

  ```python
  import arcpy

  def getAreaRaster(rst):
      # Read the COUNT field of the raster's table and return the
      # pixel count from its single row.
      arr = arcpy.da.TableToNumPyArray(rst, "COUNT")
      a, = arr.tolist()[0]
      return a
  ```

  `%custom_raster%` refers to the output from `Polygon to Raster`. The `%` uses ESRI's in-line variable scripting.
- Filtering a table using SQL: To obtain only the necessary rows, we have used `Table Select`. This tool uses an SQL expression that is built using `Calculate Value` and a Python code block.
Example 1: Getting the top 20% most prevalent species, called as `getTopRows(r"%table_in%")`:

```python
import arcpy
import numpy as np

def getTopRows(table, prop=0.2):
    arr = arcpy.da.TableToNumPyArray(table, ["SliceNumber", "COUNT"])
    n = int(round(prop * len(arr), 0))
    # np.sort is ascending, so reverse before slicing to keep the
    # n rows with the LARGEST counts (the most prevalent species)
    sort_arr = np.sort(arr, order="COUNT")[::-1][0:n]
    arr_lit = sort_arr["SliceNumber"].tolist()
    arr_int = map(int, arr_lit)
    res = ", ".join(map(str, arr_int))
    return f"SliceNumber IN ({res})"
```
Example 2: Getting only the rows with presence, called as `getPresentSpecies(r"%table_in%")`. This might be unnecessary when used in `Table Select`, since `Table Select` allows a `WHERE` query.

```python
import arcpy

def getPresentSpecies(table):
    arr = arcpy.da.TableToNumPyArray(table, ["SliceNumber", "presence"])
    out_arr = arr[arr["presence"] > 0]
    arr_lit = out_arr["SliceNumber"].tolist()
    arr_int = map(int, arr_lit)
    res = ", ".join(map(str, arr_int))
    return f"SliceNumber IN ({res})"
```
About the limit of records returned: if you forget to set a high limit for the records returned, the front end might report hitting this limit. In the returned object `w`, check `value.exceededTransferLimit`.
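A front-end-side guard for this can be sketched in a few lines (plain Python working on the parsed JSON response; the helper name `exceeded_transfer_limit` is hypothetical, but the `value.exceededTransferLimit` path matches the text above):

```python
def exceeded_transfer_limit(response):
    """Return True if a parsed GP response hit the server's record cap."""
    return bool(response.get("value", {}).get("exceededTransferLimit", False))
```

If this returns True, the fix is to republish the service with a higher `Maximum number of records returned by server`, as described in the Configuration panel steps.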
Models from Model Builder as Python code
The process inside the geoprocessing service can be found in the `he-scratchfolder` repo.
Creating the lookup tables
The process of creating the tables consists of getting the slice number and matching name from the raster mosaic dataset, then merging with data from MOL using the scientific name (notebook).
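The merge step can be sketched without the notebook's dependencies. Below is a plain-Python stand-in (the function `build_lookup` and the field names other than `SliceNumber` and `scientific_name` are hypothetical) joining the mosaic rows with MOL attributes on the scientific name:

```python
def build_lookup(mosaic_rows, mol_rows):
    """Join mosaic (SliceNumber, scientific_name) rows with MOL attributes.

    Rows without a MOL match are kept with only the mosaic fields,
    mirroring a left join on scientific_name.
    """
    mol_by_name = {row["scientific_name"]: row for row in mol_rows}
    return [
        {**row, **mol_by_name.get(row["scientific_name"], {})}
        for row in mosaic_rows
    ]
```

In the actual notebook this is done with a dataframe merge, but the key (the scientific name) and the direction of the join are the same.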
First iteration of GP Services
| Front end element | Crf name | Crf variable | Gp service | Output to use | Field to use from response | AGOL table to use | AGOL field to use |
|---|---|---|---|---|---|---|---|
| population | population2020.crf | none | GP ContextualLayersProd20220131 | output_table_population | `SUM` | none | none |
| climate_regime | ELU.crf | none | GP ContextualLayersProd20220131 | output_table_elu_majority | `MAJORITY` | agol link | `cr_type` contains the name of the type of climate regime |
| land_cover | ELU.crf | none | GP ContextualLayersProd20220131 | output_table_elu_majority | `MAJORITY` | agol link | `lc_type` contains the name of the type of land cover |
| human_encroachment | land_encroachment.crf | none | GP ContextualLayersProd20220131 | output_table_encroachment | `SliceNumber` has the code of the type of human activity; `percentage_land_encroachment` gives the percentage of each type | agol link | `SliceNumber` to join, then `Name` |
| Protection_percentage | WDPA_Terrestrial_CEA_June2021.crf | none | GP ContextualLayersProd20220131 | output_table_wdpa_percentage | percentage_protected | none | none |
| WDPA list | none | none | GP ContextualLayersProd | output_table_wdpa | `ORIG_NAME`, `DESIG_TYPE`, `IUCN_CAT`, `GOV_TYPE`, `AREA_KM`, `NAME_0` | agol link (not whitelisted yet) | `WDPA_PID` |
| mammal_data | mammals_equal_area_20211003.crf | `presence` | GP SampleMamProd20220131 | output_table | `SliceNumber` has the code of the species; `per_global` shows the area relative to the global species range; `per_aoi` shows the % of area present inside the aoi | FS lookup table (whitelisted table) | `SliceNumber`, `scientific_name`, `percent_protected`, `conservation_target`, `has_image`, `common_name` |
| amphibian_data | amphibians_equal_area_20211003.crf | `amphibians` | GP SampleAmphProd20220131 | output_table | `SliceNumber` has the code of the species; `per_global` shows the area relative to the global species range; `per_aoi` shows the % of area present inside the aoi | FS lookup table (whitelisted table) | `SliceNumber`, `scientific_name`, `percent_protected`, `conservation_target`, `has_image`, `common_name` |
| bird_data | birds_equal_area_20211003.crf | `birds` | GP SampleBirdsProd20220131 | output_table | `SliceNumber` has the code of the species; `per_global` shows the area relative to the global species range; `per_aoi` shows the % of area present inside the aoi | FS lookup table (whitelisted table) | `SliceNumber`, `scientific_name`, `percent_protected`, `conservation_target`, `has_image`, `common_name` |
| reptile_data | reptiles_equal_area_20211003.crf | `reptiles` | GP SampleReptProd20220131 | output_table | `SliceNumber` has the code of the species; `per_global` shows the area relative to the global species range; `per_aoi` shows the % of area present inside the aoi | FS lookup table (whitelisted table) | `SliceNumber`, `scientific_name`, `percent_protected`, `conservation_target`, `has_image`, `common_name` |
Source of data
- Population: WorldPop 2020 (web)
- World Terrestrial Ecosystem: Living Atlas
Querying the AGOL tables
For those geoprocessing services that require querying information from a table in ArcGIS Online, Arcade can be used to return the information (more about Arcade in these docs). The `Filter` function accepts an SQL expression and a layer.
The structure of the SQL expression is composed of the name of the field to query (in our case `SliceNumber`), then the condition `IN`, and, between parentheses, all the ids of the species returned by the geoprocessing service.
```
var lay = $layer
var sqlExpr = 'SliceNumber IN (164, 250)'
var val = Filter(lay, sqlExpr)
return val
```