Wednesday, June 12, 2024

Automating Geoprocessing with Python

Moving into Module 5, our assignment this week consists of writing a Python script to automate geoprocessing tasks. We were provided a dataset with several Alaska and New Mexico feature classes and the task of copying them into a newly created file geodatabase (fGDB). Working with lists and dictionaries, our script also implements several functions including ListFeatureClasses, Describe, and SearchCursor. Our end task was to output a dictionary (a Python list with pairs of keys:values) of New Mexico county seats.

As an aside, I always like working with data from places I have visited. Including an Albuquerque airport stop in 2007, I've been to New Mexico four times. I also updated the Universal Map Group Wall Map for Albuquerque as one of my projects in 2008.

Tijeras, New Mexico in 2017
I-40 west ahead of Tijeras from my trip to New Mexico in 2017.

The ListFeatureClasses() function returns a list of feature classes in the current workspace. This list can be stored in a variable to be used with subsequent functions. Part of the ArcPy package, the Describe() function returns a Describe object which includes properties of a feature class such as data type, geometry type, basename, etc.

Using the Describe function on the variable with the feature class list allows a property, such as the basename, to be used as part of the CopyFeatures() function in the Data Management toolbox. This function copies input feature class features into a new feature class. With a for loop, we used this to populate our newly created file geodatabase (fGDB) with a concatenation of a variable for the output environment, the name of our created fGDB and the basename property of each feature class.

The Flow Chart for this week's Lab assignment
The program flowchart for this week's Lab assignment

While the term cursor to me references the blinking vertical line in this word processor, it has a separate meaning in computer programming. Cursor is a database technology term for accessing a set of records in a table. Records in a table are referred to as rows. Iterating over the row of a table, the three cursors in Python are as follows:

  • Search Cursor - this function retrieves specific rows on the basis of attribute values. This is similar to performing a SQL query.
  • Insert Cursor - this function adds new roads to a table, which can in turn be populated with new attribute values.
  • Update Cursor - this function modifies the attribute data of existing rows or deletes rows from the table.
Each type of cursor is created by corresponding class of the arcpy.da.module. The cursor object accessed row objects, returning a tuple of field values in an order specified by the field names argument of the function. The cursor iterates over all table rows but using only specific fields as needed.

Cursors support with statements, which have the advantage of executing regardless of whether the cursor finished running successfully and completing without a data lock. A data lock ensues otherwise is a cursor is not deleted with a del statement. The data lock prevents multiple processes from changing the same table simultaneously.

With statements also incorporate SQL queries as the where_clause optional parameter in the SearchCursor syntax. SQL queries find records in a database table based upon specific criteria selecting data WHERE specific conditions occur. I often use SQL queries when updating the database for the main AARoads pages and also occasionally with the Simple Machines Forum database on the site.

The Lab assignment specified using the SearchCursor on the feature class of point data for cities in the state of New Mexico. The search criteria fields were the city NAME, the city FEATURE type and the 2000 Census population data (POP_2000). I found that assigning the SearchCursor output as variables made formatting the print statement vastly easier and cleaner looking codewise.

The biggest challenge I had was populating the dictionary with the county seats. I eventually incorporated the process in the for loop of the SearchCursor with statement. Used the W3Schools web site to narrow down the code to use, and with the previous variables, creating the dictionary was simple.

Looking at the example output included in the Lab assignment PDF, I opted to expand upon the formatting. Being the county collector that I am, I incorporated the COUNTY name field so that the output listed the associated county with each seat. Furthermore after discovering that three of the entries lacked population data, I implemented an if statement. The missing data values showed -99999. Being that it was an integer, I first cast that as a string, and then reassigned the population variable to read "not available".
Output showing completion of Geoprocessing tools
Collage of screen shots showing the output of the Module 5 Python Script


No comments:

Post a Comment