Importing Data

mapdata uses a single file or table of data. The main requirement is that the table have columns for geographic coordinates, such as latitude and longitude values. Data can be imported from any of three types of source:

  • Comma-separated value (CSV) files or similar delimited formats.

  • Spreadsheets, either Open Document (*.ods) or Excel (*.xlsx, *.xls).

  • Database tables, from client-server or file-based databases.

There are four different methods by which data can be imported:

  • Command-line options can be used to specify the data source, the names of the columns containing geographic coordinates, and other optional information.

  • If, on startup, mapdata finds a configuration file that identifies a data source in the connect section and geographic coordinate columns in the defaults section, those settings will be used to import the data table and initialize the map.

  • When started from a GUI, such as from a menu or shortcut, without command-line parameters or configuration settings, mapdata will present a wizard (a series of dialogs) that prompts for the data source and other information.

  • When mapdata is already running, options on the File menu allow a new data table to be imported to replace the one currently being used.

If mapdata will be used repeatedly with the same data source, then creating a shortcut that includes command-line options, or using a configuration file, will simplify data import on repeated launches of the program.

Data Table Requirements

There are very few strict requirements for the format or content of data files or tables to be imported, and most of these will ordinarily be met by most data sources. The requirements are:

  • There must be a single header row with a unique name for every data column.

  • Data rows should immediately follow the header row.

  • No data row should have more data values than column headers.

  • There must be columns for latitude and longitude coordinates. Geographic coordinates are expected to be in decimal degrees in WGS84 coordinates by default, though alternate coordinate reference systems (CRS) can be used if the corresponding CRS identifier is provided when data are imported. Geographic coordinates may be missing for some, but not all, data rows.

The first three of these requirements can only be violated if the data source is a CSV file or spreadsheet.
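The first three requirements can be checked before import. The following is a minimal sketch in Python, not part of mapdata; the function name check_table and the sample data are hypothetical, and the check assumes a plain delimited text file.

```python
import csv
import io


def check_table(text, delimiter=","):
    """Return a list of problems found in delimited text; empty if none."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return ["no header row"]
    header = [name.strip() for name in rows[0]]
    problems = []
    # Every column needs a name, and names must be unique.
    if any(name == "" for name in header):
        problems.append("blank column name in header")
    if len(set(header)) != len(header):
        problems.append("duplicate column names in header")
    # No data row may have more values than there are column headers.
    for row_no, row in enumerate(rows[1:], start=2):
        if len(row) > len(header):
            problems.append(f"row {row_no} has more values than column headers")
    return problems


# Hypothetical sample data: 'good' has a row with missing coordinates,
# which is allowed; 'bad' repeats a column name and has an extra value.
good = "site,lat,lon\nA,47.6,-122.3\nB,,\n"
bad = "site,lat,lat\nA,47.6,-122.3,extra\n"
print(check_table(good))
print(check_table(bad))
```

The sketch does not check that coordinate columns exist, because their names are supplied separately at import time.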

There are no strict limitations on column header names, though if, after import, data will be selected using a SQL query, some column names may be modified to conform to requirements for SQL object names (specifically, invalid characters will be replaced by underscores).
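As an illustration of the kind of renaming to expect, the sketch below replaces characters with underscores. It is not mapdata's own code: the function name is hypothetical, and it assumes that "invalid" means anything other than ASCII letters, digits, and the underscore, which may differ from the rule mapdata actually applies.

```python
import re


def sql_safe_name(name):
    """Replace characters assumed to be invalid in SQL object names."""
    # Assumption: only ASCII letters, digits, and underscores are valid.
    return re.sub(r"[^A-Za-z0-9_]", "_", name)


print(sql_safe_name("Sample Depth (m)"))  # Sample_Depth__m_
```

Choosing header names that already satisfy this pattern avoids any difference between the names in the source file and the names used in queries.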

Importing Data With Command-Line Specifications

When launching mapdata from the command line, the following flags and arguments can be used. Items in angle brackets must be replaced by appropriate values.

Options:
   -a <table>             The name of a database table to import.
   -c <color column>      The name of the column containing color
                          names
   -d <database>          The name of a client-server database from
                          which to import data
   -e <server>            The name of the server for a client-server
                          database
   -f <filename>          The name of the CSV file, spreadsheet file,
                          or file-based database to use.
   -g <image file>        The name of an image file to create
   -i <id column>         The name of the column containing location
                          identifiers
   -k <db type>           A one-letter code identifying the type of
                          database from which to import data.  Valid
                          values are: 'p'-PostgreSQL, 's'-SQL Server,
                          'l'-SQLite, 'm'-MySQL, 'k'-DuckDB,
                          'o'-Oracle, 'f'-Firebird.
   -m <message>           Text to display above the map.  This should
                          be double-quoted
   -n                     Do not prompt for a password for client-server
                          databases.  By default, if a user name is
                          provided, a dialog box will be used to prompt
                          for the user's password
   -o <port>              The port for a client-server database.  If
                          omitted, the default port for the DBMS will
                          be used
   -p <projection>        The coordinate reference system (CRS) if a
                          projected coordinate system is used
   -r <script file>       The filename of a SQL script file to run
                          before importing a database table.
   -s <symbol column>     The name of the column containing symbol
                          names
   -t <sheet name>        The name of the worksheet to import data
                          from, for spreadsheet data sources
   -u <user name>         The name of a client-server database user
   -w <image_wait>        The time to wait before creating the image
                          file, in seconds (default is 12)
   -x <longitude column>  The name of the column containing longitude
                          values
   -y <latitude column>   The name of the column containing latitude
                          values

Latitude and longitude values in the data table are assumed to be in decimal degrees in the WGS84 datum. If they are in any other coordinate system, the CRS of that coordinate system must be provided by using the -p (projection) argument.
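A quick range check can suggest whether the -p argument is needed. The sketch below is a heuristic, not something mapdata provides, and the function name is hypothetical. Values outside the valid range for degrees strongly suggest projected coordinates; values inside the range do not prove that the datum is WGS84.

```python
def looks_like_decimal_degrees(x_values, y_values):
    """True if every non-missing pair lies within the lon/lat range."""
    for x, y in zip(x_values, y_values):
        if x is None or y is None:
            continue  # coordinates may be missing for some rows
        if not (-180.0 <= x <= 180.0 and -90.0 <= y <= 90.0):
            return False
    return True


# Longitude/latitude pairs in degrees pass the check.
print(looks_like_decimal_degrees([-122.33, None], [47.61, None]))
# Easting/northing values in meters (hypothetical) do not.
print(looks_like_decimal_degrees([550000.0], [5272000.0]))
```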

Rather than marking every location with the default symbol (a black open triangle), locations can be marked with distinct symbols and colors. The markers to use should be identified in the symbol and color columns of the data file. Either or both of these columns may be used. The default marker type and color can also be changed using settings in a configuration file.

If no command-line arguments are provided, mapdata will start in GUI mode. When command-line arguments are used, there are several different valid combinations of arguments for different purposes, as listed in the following sections.

Importing from a CSV file

Required arguments are:

  • -f <filename>

  • -x <longitude column>

  • -y <latitude column>

Optional arguments are:

  • -p <projection>

  • -i <id column>

  • -s <symbol column>

  • -c <color column>

  • -m <message>

The -p (projection) argument must be used if the latitude and longitude are in any coordinate system other than WGS84.
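The argument combination for a CSV file can be assembled programmatically, for example when creating shortcuts for several data files. The following Python sketch only builds and prints a command line. The function name, the file and column names, and the launcher name "mapdata" are assumptions; the actual command used to start the program depends on how it was installed.

```python
import shlex


def csv_import_args(filename, x_column, y_column, crs=None,
                    id_column=None, symbol_column=None,
                    color_column=None, message=None):
    """Build the argument list for importing a CSV file."""
    # Required arguments.
    args = ["-f", filename, "-x", x_column, "-y", y_column]
    # Optional arguments are added only when a value is supplied.
    for flag, value in (("-p", crs), ("-i", id_column),
                        ("-s", symbol_column), ("-c", color_column),
                        ("-m", message)):
        if value is not None:
            args += [flag, str(value)]
    return args


# Hypothetical file and column names.
args = csv_import_args("sites.csv", "longitude", "latitude",
                       message="Sampling sites")
# shlex.join quotes the message so that it stays a single argument.
print(shlex.join(["mapdata"] + args))
```

Passing the arguments as a list (for example to subprocess.run) avoids shell quoting issues with the -m text altogether.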

Importing from a spreadsheet file

Required arguments are:

  • -f <filename>

  • -t <sheet name>

  • -x <longitude column>

  • -y <latitude column>

Optional arguments are:

  • -p <projection>

  • -i <id column>

  • -s <symbol column>

  • -c <color column>

  • -m <message>

The -p (projection) argument must be used if the latitude and longitude are in any coordinate system other than WGS84.

Importing from a file-based database

SQLite and DuckDB are file-based databases.

Required arguments are:

  • -k <db type>

  • -f <filename>

  • -a <table>

  • -x <longitude column>

  • -y <latitude column>

Optional arguments are:

  • -r <script file>

  • -p <projection>

  • -i <id column>

  • -s <symbol column>

  • -c <color column>

  • -m <message>

The db type argument must be either ‘l’ or ‘k’ for SQLite or DuckDB, respectively.

If a SQL script file name is provided, the script file will be run against the database before the table is imported. mapdata supports SQL script extensions to provide additional features for SQL scripting and to standardize scripts across DBMSs.

Importing from a client-server database

Required arguments are:

  • -k <db type>

  • -e <server>

  • -d <database>

  • -a <table>

  • -x <longitude column>

  • -y <latitude column>

Optional arguments are:

  • -o <port>

  • -u <user name>

  • -n

  • -r <script file>

  • -p <projection>

  • -i <id column>

  • -s <symbol column>

  • -c <color column>

  • -m <message>

The database port needs to be specified only if the database is using a non-standard port.

If a user name is specified, mapdata will prompt for the user’s password unless the -n flag is also used.
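The required and optional arguments for a client-server database can be combined as in the sketch below. The function and parameter names are hypothetical, as are the server, database, and table names. The sketch adds -n only when a user name is given, since the password prompt it suppresses is tied to the user name.

```python
# One-letter codes for the client-server DBMSs listed under -k.
CLIENT_SERVER_TYPES = {"p", "s", "m", "o", "f"}


def client_server_args(db_type, server, database, table,
                       x_column, y_column, port=None, user=None,
                       prompt_for_password=True):
    """Build the argument list for a client-server database import."""
    if db_type not in CLIENT_SERVER_TYPES:
        raise ValueError(f"not a client-server database code: {db_type!r}")
    args = ["-k", db_type, "-e", server, "-d", database, "-a", table,
            "-x", x_column, "-y", y_column]
    if port is not None:
        args += ["-o", str(port)]  # only needed for a non-standard port
    if user is not None:
        args += ["-u", user]
        if not prompt_for_password:
            args.append("-n")  # suppress the password dialog
    return args


print(client_server_args("p", "dbhost", "env", "sites", "lon", "lat",
                         user="reader", prompt_for_password=False))
```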

If a SQL script file name is provided, the script file will be run against the database before the table is imported. The same database connection is used to run the script and import the data table, so the script can create temporary tables or views in the database for the data to be imported. mapdata supports SQL script extensions to provide additional features for SQL scripting and to standardize scripts across DBMSs.
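The script-then-import sequence can be illustrated with Python's built-in sqlite3 module. SQLite is used here only because it needs no server; the table, view, and column names are hypothetical, and this is a sketch of the pattern rather than mapdata's own code. A temporary view exists only for the connection that created it, which is why running the script and the import on one connection matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Stand-in for an existing database with a base table.
conn.executescript("""
    CREATE TABLE samples (site TEXT, lat REAL, lon REAL, depth REAL);
    INSERT INTO samples VALUES ('A', 47.61, -122.33, 3.5);
    INSERT INTO samples VALUES ('B', 47.25, -122.44, NULL);
""")

# Step 1: the "script file" creates a temporary view to be mapped.
script = """
    CREATE TEMPORARY VIEW deep_sites AS
        SELECT site, lat, lon FROM samples WHERE depth IS NOT NULL;
"""
conn.executescript(script)

# Step 2: the "import" reads the view on the same connection.
rows = conn.execute("SELECT site, lat, lon FROM deep_sites").fetchall()
print(rows)
conn.close()
```

In this pattern, the name given with -a would be the view or temporary table that the script creates.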

Specifying Data Import Using a Configuration File

Numerous configuration settings will be automatically read from configuration files on startup if one or more configuration files are found. Those settings can include specification of a data source and the names of columns with geographic coordinates.

If settings that are read from configuration files (possibly in combination with command-line arguments) are sufficient to identify a data source and geographic coordinate columns, then mapdata will use those settings to import data on startup.

Configuration settings that affect data import on startup are:

  • Settings in the connect section specify the data file or database to use as a data source.

  • The x_column and y_column in the defaults section specify the names of the columns with geographic coordinates.

The Configuration Files documentation contains more details on these and other settings.
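The sufficiency test described above can be sketched as follows. This sketch assumes an INI-style file format that Python's configparser can read, which is not confirmed here, and the key name data_file in the connect section is purely hypothetical. Only the section names and the x_column and y_column settings come from this documentation.

```python
import configparser

# Hypothetical configuration text; 'data_file' is an invented key name.
SAMPLE = """
[connect]
data_file = sites.csv

[defaults]
x_column = longitude
y_column = latitude
"""


def can_import_on_startup(config_text):
    """True if the text names a data source and both coordinate columns."""
    config = configparser.ConfigParser()
    config.read_string(config_text)
    # Any setting in the connect section is treated as a data source.
    has_source = (config.has_section("connect")
                  and len(config.options("connect")) > 0)
    has_coords = (config.has_option("defaults", "x_column")
                  and config.has_option("defaults", "y_column"))
    return has_source and has_coords


print(can_import_on_startup(SAMPLE))
print(can_import_on_startup("[defaults]\nx_column = longitude\n"))
```

The actual setting names for the connect section are given in the Configuration Files documentation.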

Importing Data with the GUI Wizard

If mapdata is started and there are no command-line arguments or configuration file settings that identify a data source and geographic coordinate columns, then mapdata will prompt for this information with a series of interactive dialogs. These dialogs act as a multi-step ‘wizard’ that allows interactive browsing of files and column names to find and specify the required information.

The first step of the data import wizard is to select the type of data source.

The prompt for the type of data source to use

Each of the three buttons on this dialog will result in the display of a different dialog that is specific to the data type, i.e., for a CSV file, spreadsheet, or database data source.

The “Cancel” button will exit mapdata.py without selecting or displaying any data.

Open CSV Data File

The prompt for a CSV file to open

This dialog is displayed when mapdata.py is started in GUI mode and a CSV data source is selected, and also when the File / Open CSV menu option is selected.

This dialog is used to select the data file that will be shown in the application’s map and table displays. This dialog prompts for several pieces of information, some of which are required and some of which are optional. The “OK” button remains disabled until all of the required information is entered.

Required Information

File

The name of the data file must be entered. This must be a comma-separated-value (CSV) file with an extension of “.csv”. The first line of the data file must contain column names. The filename may be typed into the prompt or the “Browse” button can be used to find and select the desired file.

Latitude column

The name of the column in the data file that contains latitude values. Latitude values are expected to be in decimal degrees, in the WGS84 datum, by default. If the latitude (and longitude) values are in some other coordinate system, the appropriate Coordinate Reference System (CRS) value must be entered following the “CRS” prompt.

The drop-down list of column names will initially be empty, but will be populated after a valid CSV file has been identified.

Longitude column

The name of the column in the data file that contains longitude values. Longitude values are expected to be in decimal degrees, in the WGS84 datum, by default. If the longitude (and latitude) values are in some other coordinate system, the appropriate Coordinate Reference System (CRS) value must be entered following the “CRS” prompt.

The drop-down list of column names will initially be empty, but will be populated after a valid CSV file has been identified.

Optional Information

Label column

The name of a column in the data file that is to be used as a label for each location. The label will be displayed either above or below the marker for each location (below, by default, though this is configurable). If no column name is provided, locations on the map will not be labeled.

The drop-down list of column names will initially be empty, but will be populated after a valid CSV file has been identified.

CRS

The Coordinate Reference System (CRS) code identifies the units, datum, and projection of the latitude and longitude values in the data file. The default value, 4326, corresponds to units of decimal degrees, in the WGS84 datum, and unprojected. If the coordinates are projected, the appropriate CRS must be entered.

mapdata.py does not provide a drop-down list with all valid CRS values. If the latitude and longitude values are projected, you must know the CRS value. If there is some uncertainty about the correct CRS value, the Map / Change CRS menu item can be used to change the CRS of a data file after it is imported.

Symbol column

The name of a column in the data file that contains the names of symbols that are to be displayed at each location instead of the default symbol. mapdata.py includes 24 built-in symbols that can be used. Additional symbols can be loaded from a configuration file or by using the File / Import symbol menu option.

The drop-down list of column names will initially be empty, but will be populated after a valid CSV file has been identified.

Color column

The name of a column in the data file that contains the names of colors for the symbols at each location. The available color names are listed on the Colors page.

The drop-down list of column names will initially be empty, but will be populated after a valid CSV file has been identified.

Description

Text to be displayed above the map on the main display window. This may describe the data set or contain other information helpful to the user.

Open Spreadsheet Data File

This dialog is displayed when mapdata.py is started in GUI mode and a spreadsheet data source is selected, and also when the File / Open spreadsheet menu option is selected.

This dialog has three pages.

Page 1:

First page of the dialog to select and open a spreadsheet data source

Page 2:

Second page of the dialog to select and open a spreadsheet data source

Page 3:

Third page of the dialog to select and open a spreadsheet data source

On the first page, the spreadsheet file must be identified. Either an OpenDocument (.ods) or an Excel (.xlsx or .xls) spreadsheet may be selected. The ‘Next’ button cannot be selected until a file name is entered.

The first page also allows an optional description of the data set to be entered. This description will appear above the map display.

The second page prompts for the name of the worksheet to read, and the number of initial rows to skip (if any). The name of the worksheet can be selected from a drop-down list of all of the worksheets in the workbook. The ‘Next’ button on this page cannot be selected until the sheet name is identified.

The third page of the dialog prompts for additional required and optional information that specifies what data to display, and how it is to be displayed. The elements of this page are described below. The ‘OK’ button cannot be selected until the latitude and longitude columns are identified. After the ‘OK’ button is selected, mapdata.py will create or update the map and table display to show the selected data.

Required Information

Latitude column

The name of the column in the spreadsheet that contains latitude values. Latitude values are expected to be in decimal degrees, in the WGS84 datum, by default. If the latitude (and longitude) values are in some other coordinate system, the appropriate Coordinate Reference System (CRS) value must be entered following the “CRS” prompt.

Longitude column

The name of the column in the spreadsheet that contains longitude values. Longitude values are expected to be in decimal degrees, in the WGS84 datum, by default. If the longitude (and latitude) values are in some other coordinate system, the appropriate Coordinate Reference System (CRS) value must be entered following the “CRS” prompt.

Optional Information

Label column

The name of a column in the spreadsheet that is to be used as a label for each location. The label will be displayed either above or below the marker for each location (below, by default, though this is configurable). If no column name is provided, locations on the map will not be labeled.

CRS

The Coordinate Reference System (CRS) code identifies the units, datum, and projection of the latitude and longitude values in the data file. The default value, 4326, corresponds to units of decimal degrees, in the WGS84 datum, and unprojected. If the coordinates are projected, the appropriate CRS must be entered.

mapdata.py does not provide a drop-down list with all valid CRS values. If the latitude and longitude values are projected, you must know the CRS value. If there is some uncertainty about the correct CRS value, the Map / Change CRS menu item can be used to change the CRS of a data file after it is imported.

Symbol column

The name of a column in the spreadsheet that contains the names of symbols that are to be displayed at each location instead of the default symbol. mapdata.py includes 24 built-in symbols that can be used. Additional symbols can be loaded from a configuration file or by using the File / Import symbol menu option.

Color column

The name of a column in the spreadsheet that contains the names of colors for the symbols at each location. The available color names are listed on the Colors page.

Open Database Data Source

This dialog is displayed when mapdata.py is started in GUI mode and a database data source is selected, and also when the File / Open database menu option is selected.

This dialog has three pages.

Page 1:

First page of the dialog to select and open a database data source

Page 2:

Second page of the dialog to select and open a database data source

Page 3:

Third page of the dialog to select and open a database data source

The first page prompts for the type of database management system (DBMS) that is to be used, and the connection parameters necessary to connect to that database.

The supported DBMSs are:

  • PostgreSQL

  • SQLite

  • MariaDB/MySQL

  • SQL Server

  • Oracle

  • Firebird

  • DuckDB

The connection parameters that are needed differ for client-server databases and for file-based databases. SQLite and DuckDB are file-based databases; the others are client-server databases.

The first page of the dialog prompts for the following information for client-server databases:

Server

The host name or IP address of the database server. This may be local or remote. If a host name is used, it must be present in the system’s hosts file. mapdata.py does not check this value.

Database

The name of the database to be used. mapdata.py does not check this value.

User

The name of the database user. This may or may not be required, depending on the specific DBMS and how it is configured. mapdata.py does not check this value.

Password

The database connection password for the specified user. This may or may not be required, depending on the specific DBMS and how it is configured. The password is obscured (shown as asterisks) when it is entered. mapdata.py does not check this value.

Port

The port on the server that is used to connect to the database. The default port value for each of the client-server databases will be used if no alternate port is specified. mapdata.py does not check this value.
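For reference, the conventional default ports of the supported client-server DBMSs are shown in the sketch below, keyed by the one-letter codes used with the -k command-line option. The port numbers are the DBMSs' own standard defaults; the function name is hypothetical, and how mapdata looks up the default internally is not described here.

```python
# Conventional default ports, keyed by the -k database type code.
DEFAULT_PORTS = {
    "p": 5432,  # PostgreSQL
    "m": 3306,  # MariaDB/MySQL
    "s": 1433,  # SQL Server
    "o": 1521,  # Oracle
    "f": 3050,  # Firebird
}


def effective_port(db_type, port=None):
    """Return the given port, or the DBMS default if none is given."""
    if port is not None:
        return int(port)
    return DEFAULT_PORTS[db_type]


print(effective_port("p"))        # 5432
print(effective_port("p", 5433))  # 5433
```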

The ‘Next’ button on the first page of this dialog cannot be selected until values have been entered for the server and database for client-server databases.

For file-based databases, the first page of the dialog prompts for the following information:

Database file name

The name of the file containing the database table with data to be mapped.

The ‘Next’ button on the first page of this dialog cannot be selected until values have been entered for the file name for file-based databases.

The second page of the dialog prompts for the name of the database table that contains the data to be mapped. The table name should be schema-qualified, if appropriate, for those DBMSs that support schemas. mapdata.py does not check this value.

The second page of the dialog also allows entry of SQL statements that will be executed prior to reading data from the selected table. The intended use of these SQL statements is to create (temporary) tables or views in the database to be read by mapdata.py if there is no base table that contains the desired information. The Open button reads in the contents of an existing SQL script file; the Save button saves the SQL commands in a new or existing script file, and the Edit button opens the SQL commands in an external editor. The Edit button is disabled if no editor has been identified, either by an environment variable named “EDITOR” or by specification in a configuration file.
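The editor lookup described above might work along the lines of the following sketch. The function name is hypothetical, and the order of precedence (configuration file first, then the EDITOR environment variable) is an assumption that is not stated in this documentation.

```python
import os


def find_editor(config_editor=None, environ=None):
    """Return the editor command to use, or None if none is identified."""
    environ = os.environ if environ is None else environ
    # Assumed precedence: configuration setting, then EDITOR variable.
    return config_editor or environ.get("EDITOR") or None


# With neither source set, no editor is found and an Edit button
# would stay disabled.
print(find_editor(environ={}))
print(find_editor(environ={"EDITOR": "vim"}))
```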

The SQL statements entered on the second page of this dialog can use metacommands and substitution variables to implement conditional tests and loops. These features are described in the SQL Script Extensions section.

When the ‘Next’ button on the second page of the dialog is selected, mapdata.py will attempt to connect to the database and read data from the selected table. If any of the connection parameters are incorrect, mapdata.py will display an error message.

The third page of the database selection dialog prompts for additional required and optional information that specifies what data to display, and how it is to be displayed. The elements of this page are described below. The ‘OK’ button cannot be selected until the latitude and longitude columns are identified. After the ‘OK’ button is selected, mapdata.py will create or update the map and table display to show the selected data.

Required Information

Latitude column

The name of the column in the database table that contains latitude values. Latitude values are expected to be in decimal degrees, in the WGS84 datum, by default. If the latitude (and longitude) values are in some other coordinate system, the appropriate Coordinate Reference System (CRS) value must be entered following the “CRS” prompt.

Longitude column

The name of the column in the database table that contains longitude values. Longitude values are expected to be in decimal degrees, in the WGS84 datum, by default. If the longitude (and latitude) values are in some other coordinate system, the appropriate Coordinate Reference System (CRS) value must be entered following the “CRS” prompt.

Optional Information

Label column

The name of a column in the database table that is to be used as a label for each location. The label will be displayed either above or below the marker for each location (below, by default, though this is configurable). If no column name is provided, locations on the map will not be labeled.

CRS

The Coordinate Reference System (CRS) code identifies the units, datum, and projection of the latitude and longitude values in the data file. The default value, 4326, corresponds to units of decimal degrees, in the WGS84 datum, and unprojected. If the coordinates are projected, the appropriate CRS must be entered.

mapdata.py does not provide a drop-down list with all valid CRS values. If the latitude and longitude values are projected, you must know the CRS value. If there is some uncertainty about the correct CRS value, the Map / Change CRS menu item can be used to change the CRS of a data file after it is imported.

Symbol column

The name of a column in the database table that contains the names of symbols that are to be displayed at each location instead of the default symbol. mapdata.py includes 24 built-in symbols that can be used. Additional symbols can be loaded from a configuration file or by using the File / Import symbol menu option.

Color column

The name of a column in the database table that contains the names of colors for the symbols at each location. The available color names are listed on the Colors page.

Description

Text to be displayed above the map on the main display window. This may describe the data set or contain other information helpful to the user.