sdwis_drink_water

sdwis-drink-water: Safe Drinking Water Information System (SDWIS) API Wrapper

Package Contents

Classes

SdwisAPI

The SdwisAPI class provides an interface to interact with the EPA's SDWIS data service.

SdwisTable

A class to interact with the SDWIS Table API.

ResultDataParser

EnforcementAction

GeographicArea

LcrSample

LcrSampleResult

ServiceArea

Treatment

Violation

ViolationEnfAssoc

WaterSystem

WaterSystemFacility

Functions

tabulate_for_jupyter(table)

print_column_description(result_dict)

print_result_data(result_data)

print_columns(column_names)

Attributes

__version__

__author__

__license__

api

lcr_sample

lcr_sample_result

violation

water_system

water_system_facility

geographic_area

enforcement_action

service_area

treatment

violation_enf_assoc

sdwis_drink_water.__version__ = '1.1.0'
sdwis_drink_water.__author__ = ''
sdwis_drink_water.__license__ = 'MIT'
class sdwis_drink_water.SdwisAPI(base_url='https://data.epa.gov/efservice', retry_count=0, retry_delay=0, timeout=10, user_agent=None, enable_cache=True, cache_time=3600, print_url=False)

The SdwisAPI class provides an interface to interact with the EPA’s SDWIS data service. It supports making HTTP requests with optional caching and retries.

base_url

The base URL for the EPA’s data service.

Type:

str

retry_count

Number of times to retry a request on failure.

Type:

int

retry_delay

Delay between retries in seconds.

Type:

int

timeout

Timeout for the HTTP requests.

Type:

int

enable_cache

Flag to enable or disable request caching.

Type:

bool

cache_time

Duration for which the cache is valid.

Type:

int

print_url

Flag to enable or disable printing of the request URL.

Type:

bool

user_agent

User agent string for the HTTP requests.

Type:

str

get_request(url, endpoint_parameters=(), params=None, only_count=False, headers=None, use_cache=True, multi_mode=False, **kwargs)

Sends a GET request to the specified URL with optional parameters.

Parameters:
  • url (str) – The URL to send the request to.

  • endpoint_parameters (tuple) – Additional parameters for the endpoint.

  • params (dict) – Query parameters for the request.

  • only_count (bool) – Flag to return only count in the response.

  • headers (dict) – HTTP headers for the request.

  • use_cache (bool) – Flag to use cache for the request.

  • multi_mode (bool) – Flag to enable or disable multi-threading mode.

  • kwargs – Additional keyword arguments.

Returns:

Parsed JSON response from the API, or the record count when only_count is True.

Return type:

dict or int

Raises:

SdwisHTTPException – If the HTTP request fails or returns an error.
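
The interplay of retry_count, retry_delay, cache_time and use_cache can be illustrated with a minimal sketch. It is not the library's implementation: the helper name get_with_retry_and_cache, the injected fetch callable, the dict-based cache, and the reading of cache_time as seconds are all assumptions made for illustration.

```python
import time


def get_with_retry_and_cache(fetch, url, cache, retry_count=0, retry_delay=0,
                             cache_time=3600, use_cache=True, now=time.monotonic):
    """Call fetch(url), reusing a fresh cached value and retrying on failure.

    Hypothetical helper: `fetch` stands in for the real HTTP call and
    `cache` is a plain dict mapping url -> (stored_at, value).
    """
    if use_cache and url in cache:
        stored_at, value = cache[url]
        if now() - stored_at < cache_time:  # assumed unit: seconds
            return value  # cache hit that has not expired yet

    last_error = None
    for attempt in range(retry_count + 1):  # first try plus retry_count retries
        try:
            value = fetch(url)
            break
        except Exception as error:  # a real client would catch narrower errors
            last_error = error
            if attempt < retry_count:
                time.sleep(retry_delay)
    else:
        # The loop finished without a successful fetch.
        raise RuntimeError(
            f"request failed after {retry_count + 1} attempts"
        ) from last_error

    if use_cache:
        cache[url] = (now(), value)
    return value


# Demo with a fake transport that fails once, then succeeds.
calls = []


def flaky_fetch(url):
    calls.append(url)
    if len(calls) == 1:
        raise ConnectionError("temporary failure")
    return {"url": url, "rows": 3}


cache = {}
demo_url = "https://example.invalid/SOME_TABLE/JSON"  # placeholder URL
first = get_with_retry_and_cache(flaky_fetch, demo_url, cache, retry_count=1)
second = get_with_retry_and_cache(flaky_fetch, demo_url, cache, retry_count=1)
print(first, len(calls))
```

In the demo the first call needs one retry, and the second call is answered from the cache, so the fake transport is invoked only twice in total.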

class sdwis_drink_water.SdwisTable(enable_cache=True, print_url=False)

A class to interact with the SDWIS Table API.

sdwis_api

Instance of the SdwisAPI class.

Type:

SdwisAPI

_get_data_by_request(query_url, print_to_console=False, multi_threads=False)

Internal method to get data by sending a request to the specified URL.

Parameters:
  • query_url (str) – The URL to send the request to.

  • print_to_console (bool) – Flag to print the result to console.

  • multi_threads (bool) – Flag to enable multi-threading for the request.

Returns:

The query result as a list of dictionaries.

Return type:

list

_get_result_data_by_request(query_url, print_to_console=False)

Internal method to get data by sending a request to the specified URL and wrapping the result in a ResultDataParser.

Parameters:
  • query_url (str) – The URL to send the request to.

  • print_to_console (bool) – Flag to print the result to console.

Returns:

The query result wrapped in a ResultDataParser instance.

Return type:

ResultDataParser

get_all_table_names(print_to_console=True)

Retrieves all table names from the SDWIS database.

Parameters:

print_to_console (bool) – Flag to print the table names to console.

Returns:

A list of all table names.

Return type:

list

get_all_table_names_and_descriptions(print_to_console=True)

Retrieves all table names and their descriptions from the SDWIS database.

Parameters:

print_to_console (bool) – Flag to print the table names and descriptions to console.

Returns:

A list of table names and descriptions or a dictionary if not printed.

Return type:

list or dict

get_table_column_name_by_table_name(table_name='', print_to_console=True)

Retrieves column names for a specified table.

Parameters:
  • table_name (str) – Name of the table.

  • print_to_console (bool) – Flag to print the column names to console.

Returns:

A list of column names for the specified table.

Return type:

list

get_columns_description_by_table_name(table_name='', print_to_console=True, multi_threads=False)

Retrieves descriptions for all columns of a specified table.

Parameters:
  • table_name (str) – Name of the table.

  • print_to_console (bool) – Flag to print column descriptions to console.

  • multi_threads (bool) – Flag to enable multi-threaded requests.

Returns:

A dictionary mapping column names to their descriptions.

Return type:

dict

get_table_data_number(table_name='')

Retrieves the number of records in a specified table.

Parameters:

table_name (str) – Name of the table.

Returns:

The total number of records in the table.

Return type:

int

get_table_first_data_by_table_name(table_name='', print_to_console=True)

Retrieves the first record of a specified table.

Parameters:
  • table_name (str) – Name of the table.

  • print_to_console (bool) – Flag to print the first record to console.

Returns:

The first record of the table.

Return type:

dict or ResultDataParser

get_table_first_n_data_by_table_name(table_name='', n=0, print_to_console=True, multi_threads=False)

Retrieves the first ‘n’ records of a specified table.

Parameters:
  • table_name (str) – Name of the table.

  • n (int) – Number of records to retrieve.

  • print_to_console (bool) – Flag to print the records to console.

  • multi_threads (bool) – Flag to enable multi-threading.

Returns:

A list of the first ‘n’ records from the table.

Return type:

list or ResultDataParser

get_data_by_conditions(table_name='', condition1='', condition2='', condition3='', print_to_console=True, only_count=False)

Retrieves data from a table based on specified conditions.

Parameters:
  • table_name (str) – Name of the table.

  • condition1 (str) – Conditions for data retrieval.

  • condition2 (str) – Conditions for data retrieval.

  • condition3 (str) – Conditions for data retrieval.

  • print_to_console (bool) – Flag to print the results to console.

  • only_count (bool) – Flag to retrieve only the count of matching records.

Returns:

List of matching records, or count of records if only_count is True.

Return type:

list or int

summarize_data_by_epa_region(table_name='', print_to_console=True, multi_threads=False)

Summarizes data by EPA region for a specified table.

Parameters:
  • table_name (str) – Name of the table.

  • print_to_console (bool) – Flag to print summary to console.

  • multi_threads (bool) – Flag to enable multi-threading.

Returns:

A list containing the data summary by EPA region.

Return type:

list
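
A per-region summary of this kind can be sketched as one count per EPA region (the EPA uses ten numbered regions), optionally gathered in parallel. The function summarize_by_region, the injected count_for_region callable, and the shape of the returned dictionaries are hypothetical; the library's actual output format is not documented here.

```python
from concurrent.futures import ThreadPoolExecutor

EPA_REGIONS = range(1, 11)  # the EPA uses ten numbered regions


def summarize_by_region(count_for_region, multi_threads=False):
    """Return one {'epa_region': n, 'count': c} entry per region.

    `count_for_region` is a stand-in for a call that returns the number
    of records in one region (for example a request with only_count=True).
    """
    if multi_threads:
        with ThreadPoolExecutor(max_workers=5) as pool:
            # pool.map returns results in input order, so regions stay aligned.
            counts = list(pool.map(count_for_region, EPA_REGIONS))
    else:
        counts = [count_for_region(region) for region in EPA_REGIONS]
    return [{"epa_region": r, "count": c} for r, c in zip(EPA_REGIONS, counts)]


def fake_count(region):
    """Invented numbers so the sketch runs without network access."""
    return region * 100


summary = summarize_by_region(fake_count, multi_threads=True)
print(summary[0], summary[-1])
```

Threads suit this workload because each count is an independent, I/O-bound request.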

get_data_by_epa_region(table_name='', epa_region=1, print_to_console=True, only_count=False, multi_mode=False)

Retrieves data from a table filtered by a specific EPA region.

Sample URL: https://data.epa.gov/efservice/LCR_SAMPLE/EPA_REGION/=/01/JSON

Parameters:
  • table_name (str) – Name of the table.

  • epa_region (int) – EPA region number.

  • print_to_console (bool) – Flag to print the results to console.

  • only_count (bool) – Flag to retrieve only the count of records.

  • multi_mode (bool) – Flag to enable multi-threading.

Returns:

List of records or count of records if only_count is True.

Return type:

list or int
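
The sample URL for this method suggests how a region filter maps onto the request path. The sketch below reproduces that shape; build_epa_region_url is a hypothetical helper, and both the upper-casing of the table name and the two-digit zero padding are inferred from the single sample URL rather than confirmed by the library.

```python
def build_epa_region_url(table_name, epa_region,
                         base_url="https://data.epa.gov/efservice"):
    """Build a region-filtered URL shaped like the documented sample URL."""
    if not 1 <= epa_region <= 10:  # EPA regions are numbered 1 through 10
        raise ValueError("epa_region must be between 1 and 10")
    region_code = f"{epa_region:02d}"  # 1 -> "01", as in the sample URL
    return f"{base_url}/{table_name.upper()}/EPA_REGION/=/{region_code}/JSON"


url = build_epa_region_url("lcr_sample", 1)
print(url)
```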

class sdwis_drink_water.ResultDataParser(data)
get_first_n_records(n)

Return the first n records from the data.

count()

Return the number of records in the data.

show()

Display the records in the data.

_get_column_values(key)

Extract all values for a given key from the data.

_get_column_values_with_records(key)

Extract all values for a given key from the data along with their corresponding records.

find_max(key)

Find the maximum value(s) and corresponding records for a given key.

find_min(key)

Find the minimum value(s) and corresponding records for a given key.
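
Returning "value(s) and corresponding records" implies that ties are kept. A minimal sketch of that idea over a list of dictionaries follows; find_extreme is a hypothetical function, not the class's method, and the sample records (including the POPULATION field) are invented.

```python
def find_extreme(records, key, pick=max):
    """Return (extreme_value, matching_records) for `key`.

    `pick` is max or min. Records that lack the key are ignored, and
    every record tied for the extreme value is returned.
    """
    present = [r for r in records if r.get(key) is not None]
    if not present:
        return None, []
    extreme = pick(r[key] for r in present)
    return extreme, [r for r in present if r[key] == extreme]


# Invented sample data for illustration only.
records = [
    {"PWSID": "A1", "POPULATION": 500},
    {"PWSID": "B2", "POPULATION": 1200},
    {"PWSID": "C3", "POPULATION": 1200},
    {"PWSID": "D4"},
]
max_value, max_rows = find_extreme(records, "POPULATION", max)
min_value, min_rows = find_extreme(records, "POPULATION", min)
print(max_value, [r["PWSID"] for r in max_rows])
```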

_parse_condition(condition)

_filter_by_condition(condition)

find_min_with_condition(key, condition)

find_max_with_condition(key, condition)

find_by_condition(condition)

Find records that match a condition like ‘KEY>value’.
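
One way to interpret a condition string such as 'KEY>value' is to split it into a key, a comparison operator, and a value, then filter the records. The sketch below is an assumption about how such parsing could work, not the library's parser: the set of supported operators, the numeric coercion, and the sample records are all illustrative.

```python
import operator
import re

# Assumed operator set; the library may support a different one.
OPERATORS = {
    ">=": operator.ge, "<=": operator.le, "!=": operator.ne,
    ">": operator.gt, "<": operator.lt, "=": operator.eq,
}
# Two-character operators come first so ">=" is not read as ">".
CONDITION_RE = re.compile(r"^\s*(\w+)\s*(>=|<=|!=|>|<|=)\s*(.+?)\s*$")


def parse_condition(condition):
    """Split 'KEY>value' into (key, comparison function, value)."""
    match = CONDITION_RE.match(condition)
    if match is None:
        raise ValueError(f"cannot parse condition: {condition!r}")
    key, symbol, raw = match.groups()
    try:
        value = float(raw)  # compare numerically when possible
    except ValueError:
        value = raw  # otherwise compare as a string
    return key, OPERATORS[symbol], value


def find_by_condition(records, condition):
    """Return the records whose field satisfies the condition."""
    key, compare, value = parse_condition(condition)
    matches = []
    for record in records:
        field = record.get(key)
        if field is None:
            continue
        try:
            if isinstance(value, float):
                field = float(field)  # numeric strings count as numbers
            if compare(field, value):
                matches.append(record)
        except (TypeError, ValueError):
            continue  # skip fields that cannot be compared
    return matches


# Invented sample data for illustration only.
records = [
    {"PWSID": "A1", "POPULATION": 500},
    {"PWSID": "B2", "POPULATION": "1500"},
    {"PWSID": "C3", "POPULATION": 1200},
    {"PWSID": "D4"},
]
large = find_by_condition(records, "POPULATION>1000")
named = find_by_condition(records, "PWSID=B2")
print([r["PWSID"] for r in large], [r["PWSID"] for r in named])
```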

get_all_keys()

Return a set of all keys in the data.

export_data(filename, format_type='csv')

Export data to a file in the specified format.
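
Exporting a list of record dictionaries can be sketched with the standard library. This is not the method's actual code: export_records is a hypothetical function that writes to an open text stream instead of a filename so that it can run without touching the disk, and the "json" branch is an assumption, since only the 'csv' default appears in the signature.

```python
import csv
import io
import json


def export_records(records, stream, format_type="csv"):
    """Write records (a list of dicts) to an open text stream."""
    if format_type == "csv":
        # Use the union of all keys so records with missing fields still fit.
        fieldnames = sorted({key for record in records for key in record})
        writer = csv.DictWriter(stream, fieldnames=fieldnames, lineterminator="\n")
        writer.writeheader()
        writer.writerows(records)  # missing fields are written as empty cells
    elif format_type == "json":  # assumed extra format, not confirmed
        json.dump(records, stream, indent=2)
    else:
        raise ValueError(f"unsupported format_type: {format_type!r}")


# Invented sample data for illustration only.
sample = [{"PWSID": "A1", "STATE": "VT"}, {"PWSID": "B2"}]
buffer = io.StringIO()
export_records(sample, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file with this sketch, pass a stream opened with open(filename, "w", newline="") in place of the StringIO buffer.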

intersect_with(other_data)

Find the intersection of the current data with the specified data.

Parameters:

other_data (ResultDataParser) – The specified data to intersect with.

Returns:

New instance containing the intersecting records.

Return type:

ResultDataParser

difference_with(other_data)

Find the difference between the current data and specified data.

Parameters:

other_data (ResultDataParser) – The specified data to compare with.

Returns:

New instance containing the difference.

Return type:

ResultDataParser

merge_with(other_data)

Merge the current data with the specified data.

Parameters:

other_data (ResultDataParser) – The specified data to merge with.

Returns:

New instance containing the merged data.

Return type:

ResultDataParser
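
The three set-style operations (intersection, difference, merge) can be sketched on plain lists of dictionaries. The functions below are hypothetical and compare whole records for equality. Whether the library matches on whole records or on a key column, and whether merging removes duplicates, are assumptions here.

```python
def _freeze(record):
    """Turn a dict with hashable values into an order-independent key."""
    return tuple(sorted(record.items()))


def intersect(left, right):
    """Records of `left` that also appear in `right`."""
    right_keys = {_freeze(r) for r in right}
    return [r for r in left if _freeze(r) in right_keys]


def difference(left, right):
    """Records of `left` that do not appear in `right`."""
    right_keys = {_freeze(r) for r in right}
    return [r for r in left if _freeze(r) not in right_keys]


def merge(left, right):
    """All records from both inputs, keeping the first copy of duplicates."""
    seen, merged = set(), []
    for record in list(left) + list(right):
        key = _freeze(record)
        if key not in seen:
            seen.add(key)
            merged.append(record)
    return merged


# Invented sample data for illustration only.
a = [{"PWSID": "A1"}, {"PWSID": "B2"}]
b = [{"PWSID": "B2"}, {"PWSID": "C3"}]
common = intersect(a, b)
only_a = difference(a, b)
combined = merge(a, b)
print(common, only_a, combined)
```

Each function builds a new list and leaves its inputs unchanged, mirroring the documented behaviour of returning a new instance.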

remove_key(key_to_remove)

Remove a specified key from all data records.

Parameters:

key_to_remove (str) – The key to be removed from the data records.

Returns:

New instance with the key removed from all records.

Return type:

ResultDataParser
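
Because the method returns a new instance, the original records are expected to stay intact. A minimal sketch of that non-mutating behaviour on plain dictionaries follows; remove_key here is a hypothetical free function and the sample records are invented.

```python
def remove_key(records, key_to_remove):
    """Return copies of the records without `key_to_remove`."""
    return [
        {k: v for k, v in record.items() if k != key_to_remove}
        for record in records
    ]


# Invented sample data for illustration only.
original = [{"PWSID": "A1", "STATE": "VT"}, {"PWSID": "B2"}]
trimmed = remove_key(original, "STATE")
print(trimmed)
```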

class sdwis_drink_water.EnforcementAction(print_url=False)
get_table_column_name(print_to_console=True)
get_table_columns_description(multi_threads=False, print_to_console=True)
get_table_data_number()
get_table_first_data(print_to_console=True)
get_table_first_n_data(n=0, multi_threads=False, print_to_console=True)
get_enforcement_action_data_by_conditions(condition1='', condition2='', condition3='', print_to_console=True, only_count=False)
class sdwis_drink_water.GeographicArea(print_url=False)
get_table_column_name(print_to_console=True)
get_table_columns_description(multi_threads=False, print_to_console=True)
get_table_data_number()
get_table_first_data(print_to_console=True)
get_table_first_n_data(n=0, multi_threads=False, print_to_console=True)
summarize_geographic_area_data_by_epa_region(print_to_console=True, multi_threads=False)
get_geographic_area_by_epa_region(epa_region=1, print_to_console=True, only_count=False)
get_geographic_area_data_by_conditions(condition1='', condition2='', condition3='', print_to_console=True, only_count=False)
class sdwis_drink_water.LcrSample(print_url=False)
get_table_column_name(print_to_console=True)
get_table_columns_description(multi_threads=False, print_to_console=True)
get_table_data_number()
get_table_first_data(print_to_console=True)
get_table_first_n_data(n=0, multi_threads=False, print_to_console=True)
summarize_lcr_sample_data_by_epa_region(print_to_console=True, multi_threads=False)
get_lcr_sample_by_epa_region(epa_region=1, print_to_console=True, only_count=False)
get_lcr_sample_data_by_conditions(condition1='', condition2='', condition3='', print_to_console=True, only_count=False)
class sdwis_drink_water.LcrSampleResult(print_url=False)
get_table_column_name(print_to_console=True)
get_table_columns_description(multi_threads=False, print_to_console=True)
get_table_data_number()
get_table_first_data(print_to_console=True)
get_table_first_n_data(n=0, multi_threads=False, print_to_console=True)
summarize_lcr_sample_result_data_by_epa_region(print_to_console=True, multi_threads=False)
get_lcr_sample_result_by_epa_region(epa_region=1, print_to_console=True, only_count=False)
get_lcr_sample_result_data_by_conditions(condition1='', condition2='', condition3='', print_to_console=True, only_count=False)
class sdwis_drink_water.ServiceArea(print_url=False)
get_table_column_name(print_to_console=True)
get_table_columns_description(multi_threads=False, print_to_console=True)
get_table_data_number()
get_table_first_data(print_to_console=True)
get_table_first_n_data(n=0, multi_threads=False, print_to_console=True)
get_service_area_data_by_conditions(condition1='', condition2='', condition3='', print_to_console=True, only_count=False)
class sdwis_drink_water.Treatment(print_url=False)
get_table_column_name(print_to_console=True)
get_table_columns_description(multi_threads=False, print_to_console=True)
get_table_data_number()
get_table_first_data(print_to_console=True)
get_table_first_n_data(n=0, multi_threads=False, print_to_console=True)
get_treatment_data_by_conditions(condition1='', condition2='', condition3='', print_to_console=True, only_count=False)
class sdwis_drink_water.Violation(print_url=False)
get_table_column_name(print_to_console=True)
get_table_columns_description(multi_threads=False, print_to_console=True)
get_table_data_number()
get_table_first_data(print_to_console=True)
get_table_first_n_data(n=0, multi_threads=False, print_to_console=True)
summarize_violation_data_by_epa_region(print_to_console=True, multi_threads=False)
get_violation_by_epa_region(epa_region=1, print_to_console=True, only_count=False)
get_violation_data_by_conditions(condition1='', condition2='', condition3='', print_to_console=True, only_count=False)
class sdwis_drink_water.ViolationEnfAssoc(print_url=False)
get_table_column_name(print_to_console=True)
get_table_columns_description(multi_threads=False, print_to_console=True)
get_table_data_number()
get_table_first_data(print_to_console=True)
get_table_first_n_data(n=0, multi_threads=False, print_to_console=True)
get_violation_enf_assoc_data_by_conditions(condition1='', condition2='', condition3='', print_to_console=True, only_count=False)
class sdwis_drink_water.WaterSystem(print_url=False)
get_table_column_name(print_to_console=True)
get_table_columns_description(multi_threads=False, print_to_console=True)
get_table_data_number()
get_table_first_data(print_to_console=True)
get_table_first_n_data(n=0, multi_threads=False, print_to_console=True)
summarize_water_system_data_by_epa_region(print_to_console=True, multi_threads=False)
get_water_system_by_epa_region(epa_region=1, print_to_console=True, only_count=False)
get_water_system_data_by_conditions(condition1='', condition2='', condition3='', print_to_console=True, only_count=False)
class sdwis_drink_water.WaterSystemFacility(print_url=False)
get_table_column_name(print_to_console=True)
get_table_columns_description(multi_threads=False, print_to_console=True)
get_table_data_number()
get_table_first_data(print_to_console=True)
get_table_first_n_data(n=0, multi_threads=False, print_to_console=True)
summarize_water_system_facility_data_by_epa_region(print_to_console=True, multi_threads=False)
get_water_system_facility_by_epa_region(epa_region=1, print_to_console=True, only_count=False)
get_water_system_facility_data_by_conditions(condition1='', condition2='', condition3='', print_to_console=True, only_count=False)
sdwis_drink_water.tabulate_for_jupyter(table)
sdwis_drink_water.print_column_description(result_dict)
sdwis_drink_water.print_result_data(result_data)
sdwis_drink_water.print_columns(column_names)
sdwis_drink_water.api
sdwis_drink_water.lcr_sample
sdwis_drink_water.lcr_sample_result
sdwis_drink_water.violation
sdwis_drink_water.water_system
sdwis_drink_water.water_system_facility
sdwis_drink_water.geographic_area
sdwis_drink_water.enforcement_action
sdwis_drink_water.service_area
sdwis_drink_water.treatment
sdwis_drink_water.violation_enf_assoc