A Simple Map Explorer for Coordinate Data

An example of the mapdata user interface

mapdata.py is a Python program that simultaneously displays an interactive map and table of data to facilitate data exploration. The data table must contain columns of geographic coordinates so that locations can be displayed on the map.

Selecting (clicking on) a point on the map highlights the corresponding row in the table, and vice versa. Data can also be selected and highlighted by entering query expressions that are applied to the data table.

Data can be read from a CSV file, spreadsheet file, or database.

In addition to the map and table displays, data can be viewed in several types of plots, and several types of statistical summaries and visualizations can be produced. Plots and statistical summaries may include all data points in the table or only selected data points. If data selections are changed on the map or in the table, plots and other summaries of selected data will, by default, be updated automatically.

mapdata is intended to provide an easy way to quickly explore a spatial data set. Although it includes some simple data analysis tools (e.g., linear regression), it is not intended to be a comprehensive analytics platform for either spatial or tabular data. Easy and flexible exploration with mapdata may, however, reveal characteristics of a data set that are worth investigating in more detail with a geographic information system or statistical software.

Uses

You can use mapdata to:

  • Quickly visualize data in a spatial context. This may be useful for exploration of an unfamiliar data set.

  • Review attributes of locations on the map by clicking on those locations and examining the highlighted rows in the table.

  • Distinguish different types or categories of locations on the map, using different symbols or colors. This may be useful for visualizing data that are summarized from a known source, such as a database query that assigns symbols based on data attributes.

  • Use expressions to select data rows based on data values, and highlight all of those data values in the table and on the map.

  • Select data for further analyses by exporting the selected subset of rows from the data table to a new data file.

  • Evaluate different coordinate reference systems (CRSs) for data sets where the CRS is uncertain. This may be needed as part of the exploration of a new data set.

  • Export a map graphic showing the locations of points in the data file, possibly including markers on selected locations.

  • Visualize univariate and bivariate relationships among variables, for either all data or selected data, with figures such as scatter plots, line charts, and box plots.

  • Export figure graphics to document and communicate features of the data set.

Example histogram

Data Requirements

Data to be displayed on the map must be in a CSV (comma-separated values) file, a spreadsheet file, or a database table. OpenDocument and Excel spreadsheets are supported. Data can be pulled from PostgreSQL, SQLite, DuckDB, MariaDB/MySQL, SQL Server, Oracle, or Firebird databases.

Two columns in the data table must contain latitude and longitude values. These values are assumed, by default, to be in decimal degrees in the WGS84 datum. This coordinate system is ordinarily described by a coordinate reference system (CRS) identifier of “4326”. (“SRID” and “EPSG code” are equivalent names for this identifier.) If the values are in some other coordinate system, then the CRS for that coordinate system must be provided either on the command line or in response to mapdata’s prompt for a data file.
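Before loading a file, it can help to confirm that the coordinate columns really hold decimal degrees. The following is a minimal sketch of such a check, not part of mapdata itself; the function name, the column names, and the sample rows are all hypothetical.

```python
import csv
import io


def check_coordinates(rows, lon_col, lat_col):
    """Return (row_number, problem) pairs for rows with unusable coordinates."""
    problems = []
    for number, row in enumerate(rows, start=1):
        try:
            lon = float(row[lon_col])
            lat = float(row[lat_col])
        except (KeyError, TypeError, ValueError):
            problems.append((number, "missing or non-numeric coordinate"))
            continue
        # Decimal degrees must fall within these limits.
        if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
            problems.append((number, "outside the decimal-degree range"))
    return problems


# Hypothetical sample data; the second row looks like projected coordinates.
sample = io.StringIO(
    "site_id,x_coord,y_coord\n"
    "A1,-122.33,47.61\n"
    "A2,551000,5272000\n"
    "A3,,47.70\n"
)
problems = check_coordinates(csv.DictReader(sample), "x_coord", "y_coord")
for number, problem in problems:
    print(f"row {number}: {problem}")
```

Values outside the decimal-degree range often indicate a projected coordinate system, in which case the CRS would need to be supplied as described above.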

GUI and Command-line Modes

mapdata can be launched in either of two ways. One way solely uses a graphical user interface (GUI), and the user will be initially prompted to select the data source to use. The other way uses command-line options to specify the data source and the columns containing latitude and longitude information. When mapdata is started using command-line options, all subsequent operations are carried out using the GUI.

Initial dialog prompting for a type of data source

GUI Mode

To start the program in GUI mode, simply double-click on the mapdata.py (or mapdata.pyw) program name. Alternatively, a shortcut (on Windows) or a .desktop file (on Linux) can be used to launch the program. When it starts in GUI mode, mapdata will initially display a dialog box prompting for the type of data source to use, and then for specific details about the data source. Mapdata can also be started from the command line, with or without arguments.

Command-line Mode

When operating the program in command-line mode, the following flags and arguments can be used on the command line. Items in angle brackets must be replaced by appropriate values:

Options:
   -a <table>             The name of a database table to import.
   -c <color column>      The name of the column containing color
                          names
   -d <database>          The name of a client-server database from
                          which to import data
   -e <server>            The name of the server for a client-server
                          database
   -f <filename>          The name of the CSV file, spreadsheet file,
                          or file-based database to use.
   -g <image file>        The name of an image file to create
   -i <id column>         The name of the column containing location
                          identifiers
   -k <db type>           A one-letter code identifying the type of
                          database from which to import data.  Valid
                          values are: 'p'-PostgreSQL, 's'-SQL Server,
                          'l'-SQLite, 'm'-MySQL, 'k'-DuckDB,
                          'o'-Oracle, 'f'-Firebird.
   -m <message>           Text to display above the map.  This should
                          be double-quoted
   -n                     Do not prompt for a password for client-server
                          databases.  By default, if a user name is
                          provided, a dialog box will be used to prompt
                          for the user's password
   -o <port>              The port for a client-server database.  If
                          omitted, the default port for the DBMS will
                          be used
   -p <projection>        The coordinate reference system (CRS) if a
                          projected coordinate system is used
   -r <script file>       The filename of a SQL script file to run
                          before importing a database table.
   -s <symbol column>     The name of the column containing symbol
                          names
   -t <sheet name>        The name of the worksheet to import data
                          from, for spreadsheet data sources
   -u <user name>         The name of a client-server database user
   -w <image_wait>        The time to wait before creating the image
                          file, in seconds (default is 12)
   -x <longitude column>  The name of the column containing longitude
                          values
   -y <latitude column>   The name of the column containing latitude
                          values

Latitude and longitude values in the data table are assumed to be in decimal degrees in the WGS84 datum. If they are in any other coordinate system, the CRS of that coordinate system must be provided by using the -p (projection) argument.

Instead of marking every location with the default symbol (a black open triangle), locations can be marked with distinct symbols and colors. The markers to use should be identified in the symbol and color columns of the data file. Either or both of these columns may be used. The default marker type and color can also be changed using settings in a configuration file.

If no command-line arguments are provided, mapdata will start in GUI mode. When command-line arguments are used, there are several different valid combinations of arguments for different purposes, as listed below.

Import from a CSV file

Required arguments are:

  • -f <filename>

  • -x <longitude column>

  • -y <latitude column>

Optional arguments are:

  • -p <projection>

  • -i <id column>

  • -s <symbol column>

  • -c <color column>

  • -m <message>

The -p (projection) argument must be used if the latitude and longitude are in any coordinate system other than WGS84.
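As an illustration of how the required and optional arguments combine, here is a small sketch that assembles a command line for a CSV source. The helper function, the file name, the column names, and the CRS value are hypothetical, and the sketch assumes mapdata.py is in the current directory and is started with the Python interpreter.

```python
import shlex
import sys


def csv_args(filename, lon_col, lat_col, crs=None, id_col=None,
             symbol_col=None, color_col=None, message=None):
    """Build the mapdata argument list for a CSV data source."""
    # Required arguments come first.
    args = ["-f", filename, "-x", lon_col, "-y", lat_col]
    # Optional arguments are added only when a value was given.
    optional = [("-p", crs), ("-i", id_col), ("-s", symbol_col),
                ("-c", color_col), ("-m", message)]
    for flag, value in optional:
        if value is not None:
            args += [flag, str(value)]
    return args


# Hypothetical data set with coordinates in a projected CRS (EPSG 26910).
args = csv_args("sites.csv", "x_coord", "y_coord", crs=26910,
                id_col="site_id", message="Sampling sites")
command = [sys.executable, "mapdata.py"] + args

# shlex.join quotes the message so it can be pasted into a POSIX shell.
print(shlex.join(command))
```

Passing the list to subprocess.run, rather than pasting the printed string into a shell, avoids quoting problems with messages that contain spaces.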

Import from a spreadsheet file

Required arguments are:

  • -f <filename>

  • -t <sheet name>

  • -x <longitude column>

  • -y <latitude column>

Optional arguments are:

  • -p <projection>

  • -i <id column>

  • -s <symbol column>

  • -c <color column>

  • -m <message>

The -p (projection) argument must be used if the latitude and longitude are in any coordinate system other than WGS84.

Import from a file-based database

SQLite and DuckDB are file-based databases.

Required arguments are:

  • -k <db type>

  • -f <filename>

  • -a <table>

  • -x <longitude column>

  • -y <latitude column>

Optional arguments are:

  • -r <script file>

  • -p <projection>

  • -i <id column>

  • -s <symbol column>

  • -c <color column>

  • -m <message>

The db type argument must be either ‘l’ or ‘k’ for SQLite or DuckDB, respectively.

If a SQL script file name is provided, the script file will be run against the database before the table is imported. Mapdata supports SQL script extensions to provide additional features for SQL scripting and to standardize scripts across DBMSs.
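To try this combination of arguments, a small SQLite database can be created with Python's standard library. The sketch below is only an illustration: the database file, table name, column names, and coordinate values are hypothetical, and the last lines merely print the arguments that would be passed to mapdata.

```python
import os
import sqlite3
import tempfile

# Create a throwaway SQLite file holding a table of locations.
db_path = os.path.join(tempfile.mkdtemp(), "sites.sqlite")
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE sites (site_id TEXT, lon REAL, lat REAL)")
conn.executemany(
    "INSERT INTO sites VALUES (?, ?, ?)",
    [("A1", -122.33, 47.61), ("A2", -122.41, 47.66)],
)
conn.commit()
row_count = conn.execute("SELECT COUNT(*) FROM sites").fetchone()[0]
conn.close()

# 'l' is the db type code for SQLite; the table and columns match the ones above.
args = ["-k", "l", "-f", db_path, "-a", "sites",
        "-x", "lon", "-y", "lat", "-i", "site_id"]
print(f"{row_count} rows written")
print("mapdata.py " + " ".join(args))
```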

Import from a client-server database

Required arguments are:

  • -k <db type>

  • -e <server>

  • -d <database>

  • -a <table>

  • -x <longitude column>

  • -y <latitude column>

Optional arguments are:

  • -o <port>

  • -u <user name>

  • -n

  • -r <script file>

  • -p <projection>

  • -i <id column>

  • -s <symbol column>

  • -c <color column>

  • -m <message>

The database port needs to be specified only if the database is using a non-standard port.

If a user name is specified, mapdata will prompt for the user’s password unless the -n flag is also used.

If a SQL script file name is provided, the script file will be run against the database before the table is imported. The same database connection is used to run the script and import the data table, so the script can create temporary tables or views in the database for the data to be imported. Mapdata supports SQL script extensions to provide additional features for SQL scripting and to standardize scripts across DBMSs.
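The following sketch shows one way to assemble the arguments for a client-server source while guarding against a file-based db type code. The helper function and all connection details (server, database, table, user, port) are hypothetical; only the flags and db type codes are taken from the option list above.

```python
# Db type codes for client-server databases, from the option list.
CLIENT_SERVER_TYPES = {
    "p": "PostgreSQL",
    "s": "SQL Server",
    "m": "MySQL",
    "o": "Oracle",
    "f": "Firebird",
}


def server_args(db_type, server, database, table, lon_col, lat_col,
                port=None, user=None, prompt_for_password=True):
    """Build the mapdata argument list for a client-server database."""
    if db_type not in CLIENT_SERVER_TYPES:
        raise ValueError(f"not a client-server db type code: {db_type!r}")
    args = ["-k", db_type, "-e", server, "-d", database, "-a", table,
            "-x", lon_col, "-y", lat_col]
    if port is not None:
        # Needed only when the server uses a non-standard port.
        args += ["-o", str(port)]
    if user is not None:
        args += ["-u", user]
        if not prompt_for_password:
            # -n suppresses the password prompt.
            args.append("-n")
    return args


# Hypothetical PostgreSQL connection on a non-standard port.
args = server_args("p", "dbhost.example.org", "fieldwork", "sample_sites",
                   "longitude", "latitude", port=5433, user="analyst")
print("mapdata.py " + " ".join(args))
```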

Automated map image generation

Mapdata can be used to automatically create an image file containing the basemap and location markers. To do this, command-line arguments for one of the types of data sources (described above) must be provided, and in addition, the following two arguments can be used to specify that an image should be exported:

  • -g <image file>

  • -w <image_wait>

The -w argument specifies how long, in seconds, mapdata should wait after loading the data before exporting the map image. The delay allows time for basemap tiles to be downloaded and added to the map. The default value is 12 seconds. If this is not enough time, use the -w argument to specify a longer wait.

If a password is required for a client-server database, the prompt for this password will not affect the time allocated to download basemap tiles.
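Image export lends itself to batch use. The sketch below builds one command per CSV file and names each image after its data file. The helper function, file names, and column names are hypothetical, and the .png extension is an assumption, since the image formats that mapdata accepts are not listed here. The commands are only printed; each could be run with subprocess.run, assuming mapdata.py is in the current directory.

```python
import sys
from pathlib import Path


def image_command(csv_file, lon_col, lat_col, wait_seconds=None):
    """Build a command that maps one CSV file and exports an image."""
    # Assumed naming scheme: same base name as the data file, .png extension.
    image_file = str(Path(csv_file).with_suffix(".png"))
    command = [sys.executable, "mapdata.py", "-f", str(csv_file),
               "-x", lon_col, "-y", lat_col, "-g", image_file]
    if wait_seconds is not None:
        # Override the default wait when tiles need longer to download.
        command += ["-w", str(wait_seconds)]
    return command


# Hypothetical input files sharing the same coordinate column names.
csv_files = ["north_sites.csv", "south_sites.csv"]
commands = [image_command(name, "lon", "lat", wait_seconds=20)
            for name in csv_files]
for command in commands:
    print(" ".join(command[1:]))
```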