
Utilities to handle PMF data

Examples

Load the PMF

Given an output folder in /home/myname/Documents/PMF/GRE-cb/MobilAir_woOrga that looked like:

MobilAir_woOrga
├── GRE-cb_BaseErrorEstimationSummary.xlsx
├── GRE-cb_base.xlsx
├── GRE-cb_boot.xlsx
├── GRE-cb_ConstrainedDISPest.dat
├── GRE-cb_ConstrainedDISPres1.txt
├── GRE-cb_ConstrainedDISPres2.txt
├── GRE-cb_ConstrainedDISPres3.txt
├── GRE-cb_ConstrainedDISPres4.txt
├── GRE-cb_ConstrainedErrorEstimationSummary.xlsx
├── GRE-cb_Constrained.xlsx
├── GRE-cb_diagnostics.xlsx
├── GRE-cb_DISPest.dat
├── GRE-cb_DISPres1.txt
├── GRE-cb_DISPres2.txt
├── GRE-cb_DISPres3.txt
├── GRE-cb_DISPres4.txt
├── GRE-cb_Gcon_profile_boot.xlsx
├── GRE-cb_rotational_comments.txt
└── GRE-cb_sourcecontributions.xls

To convert them to a PMF object, run the following command:

from py4pm.pmfutilities import PMF

grecb = PMF(site="GRE-cb", BDIR="/home/myname/Documents/PMF/GRE-cb/MobilAir_woOrga")

Now, grecb is an instance of a PMF object, and has many readers and plotters.
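Before building the object, it can help to check that the folder contains the xlsx files the readers rely on. The helper below is a hypothetical sketch, not part of py4pm: the name `missing_pmf_files` and the list of required suffixes are assumptions taken from the folder listing above.

```python
from pathlib import Path

# Suffixes of the xlsx files mentioned in this document
# (assumption: derived from the folder listing, not from py4pm itself).
REQUIRED_SUFFIXES = [
    "_base.xlsx",
    "_Constrained.xlsx",
    "_boot.xlsx",
    "_Gcon_profile_boot.xlsx",
    "_BaseErrorEstimationSummary.xlsx",
    "_ConstrainedErrorEstimationSummary.xlsx",
]


def missing_pmf_files(bdir, site):
    """Return the expected file names that are absent from `bdir`."""
    folder = Path(bdir)
    expected = [f"{site}{suffix}" for suffix in REQUIRED_SUFFIXES]
    return [name for name in expected if not (folder / name).is_file()]
```

An empty list means every expected file is present for the given site prefix.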

Read the data

Organization

The read accessor of the PMF object gives access to several readers that retrieve data from the xlsx files output by the EPA PMF5 software.

Their names all start with read_base* or read_constrained*, for the base and constrained run, respectively.

The special method read_metadata retrieves the factor names and species names from the _base.xlsx file, and they are then used everywhere else. It also tries to set the name of the total variable, if any (one of PM10, PM2.5, PMrecons, PM10rec, PM10recons; otherwise it tries to guess). The total variable is used to convert units and is the default variable to plot.
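The selection of the total variable can be pictured as follows. This is only a sketch under stated assumptions: `guess_total_var` is a hypothetical name, and the priority order of the known names is assumed, not taken from py4pm.

```python
# Candidate names listed above; the priority order is an assumption.
KNOWN_TOTAL_VARS = ["PM10", "PM2.5", "PMrecons", "PM10rec", "PM10recons"]


def guess_total_var(species):
    """Pick the total variable among `species`, or None if nothing fits."""
    for name in KNOWN_TOTAL_VARS:
        if name in species:
            return name
    # Fallback: first species whose name contains "PM".
    for name in species:
        if "PM" in name:
            return name
    return None


print(guess_total_var(["PMrecons", "OC*", "EC", "Cl-"]))  # prints PMrecons
```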

For now, the following readers are implemented :

Contribution

The contributions of the factors (G matrix) are read from the _base.xlsx and _Constrained.xlsx files, sheet contributions. You can read them using the reader read_base_contributions and read_constrained_contributions:

grecb.read.read_base_contributions()
grecb.read.read_constrained_contributions()

The grecb object now has the dfcontrib_b and dfcontrib_c attributes (_b for the base run, _c for the constrained run):

>>> grecb.dfcontrib_c

            Sulfate-rich  Nitrate-rich  ...  Biomass burning  Sea/road salt  Mineral dust
Date                                    ...
2017-02-28      0.321580     -0.105980  ...          0.19419       0.606290      0.182880
2017-03-03      0.429480     -0.038802  ...          0.61595       0.050129      0.382890
2017-03-06     -0.098123     -0.151530  ...          0.53346       4.636400      0.272410
2017-03-09      0.643500     -0.002527  ...          1.09060       0.153200      1.083600
2017-03-12      0.664090      0.308390  ...          1.70740      -0.200000      0.846930

which is the G matrix, in normalized units.

Chemical profiles

The chemical profiles (or simply profiles) are the F matrix of the PMF (in µg/m³) and are read from the _base.xlsx and _Constrained.xlsx files, sheet Profiles. You can read them using the readers read_base_profiles and read_constrained_profiles:

grecb.read.read_base_profiles()
grecb.read.read_constrained_profiles()

and grecb now has non-empty dfprofiles_b and dfprofiles_c dataframes:

>>> grecb.dfprofiles_c
              Sulfate-rich  Nitrate-rich  ...  Biomass burning  Sea/road salt  Mineral dust
specie                                    ... 
PMrecons          4.402500      2.421300  ...         3.027900       0.364280      2.009600
OC*               1.225300      0.000000  ...         1.308900       0.041038      0.428110
EC                0.162970      0.000000  ...         0.347050       0.019199      0.030703
Cl-               0.000000      0.002425  ...         0.026819       0.109070      0.000000
NO3-              0.300660      1.702200  ...         0.093396       0.000000      0.000000
SO42-             0.977680      0.010441  ...         0.092800       0.032969      0.189890
...                    ...           ...  ...             ...            ...           ...

The values are in µg/m³.

Uncertainties
Summary

You can also read the bootstrap and DISP results from the _BaseErrorEstimationSummary.xlsx and _ConstrainedErrorEstimationSummary.xlsx files:

grecb.read.read_base_uncertainties_summary()
grecb.read.read_constrained_uncertainties_summary()

and now you have access to df_uncertainties_summary_b and df_uncertainties_summary_c: the summaries of the BS, DISP and BS-DISP uncertainties for each profile and species.

>>> grecb.df_uncertainties_summary_c
                       Constrained base run    BS 5th  BS median   BS 95th  BS-DISP 5th  BS-DISP average  BS-DISP 95th  DISP Min  DISP average  DISP Max
profile      specie
Sulfate-rich PMrecons              4.402500  4.261867   4.511374  4.709612          NaN              NaN           NaN  3.788500      4.337850  4.887200
             OC*                   1.225300  0.822712   1.161025  1.702325          NaN              NaN           NaN  0.988480      1.211690  1.434900
             EC                    0.162970  0.051262   0.211147  0.436615          NaN              NaN           NaN  0.121070      0.213030  0.304990
             Cl-                   0.000000  0.000000   0.000000  0.000000          NaN              NaN           NaN  0.000000      0.006156  0.012311
             NO3-                  0.300660  0.000000   0.346984  0.563892          NaN              NaN           NaN  0.068862      0.260436  0.452010
...                                     ...       ...        ...       ...          ...              ...           ...       ...           ...       ...
Mineral dust Se                    0.000008  0.000000   0.000012  0.000029          NaN              NaN           NaN  0.000000      0.000023  0.000046
             Sn                    0.000000  0.000000   0.000032  0.000154          NaN              NaN           NaN  0.000000      0.000069  0.000139
             Ti                    0.002545  0.001121   0.001750  0.002546          NaN              NaN           NaN  0.002881      0.003464  0.004047
             V                     0.000265  0.000063   0.000145  0.000249          NaN              NaN           NaN  0.000265      0.000278  0.000290
             Zn                    0.000218  0.000000   0.000030  0.001286          NaN              NaN           NaN  0.000000      0.000177  0.000354
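One simple use of this summary is to flag species whose reference value falls outside an uncertainty interval. The sketch below uses plain Python with values copied from the Mineral dust / Ti row above; the helper `within` is a hypothetical name, not a py4pm function.

```python
# Values copied from the Mineral dust / Ti row of the summary above.
ti = {
    "Constrained base run": 0.002545,
    "BS 5th": 0.001121,
    "BS 95th": 0.002546,
    "DISP Min": 0.002881,
    "DISP Max": 0.004047,
}


def within(row, low, high, ref="Constrained base run"):
    """True when the reference value lies inside [low, high]."""
    return row[low] <= row[ref] <= row[high]


in_bs = within(ti, "BS 5th", "BS 95th")
in_disp = within(ti, "DISP Min", "DISP Max")
print(in_bs, in_disp)  # prints True False
```

Here the reference value of Ti lies inside the bootstrap interval but below the DISP minimum, the kind of row worth a closer look.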

All bootstrap profiles

If you want to retrieve the individual bootstrap results, read them from _boot.xlsx and _Gcon_profile_boot.xlsx:

grecb.read.read_base_bootstrap()
grecb.read.read_constrained_bootstrap()

and now you have access to dfBS_profile_b and dfBS_profile_c, which are all the bootstrap chemical profiles for the base and constrained run, respectively.

>>> grecb.dfBS_profile_c
                              Boot0     Boot1     Boot2  ...    Boot97    Boot98   Boot100
specie   profile                                         ... 
PMrecons Sulfate-rich      4.412330  2.259480  4.330630  ...  3.191810  4.041220  3.109190
         Nitrate-rich      2.462740  2.254470  2.609910  ...  2.068200  2.349640  2.404520
         Industrial        0.259120  0.289952  0.474484  ...  0.214298  0.206250  0.875102
         Primary biogenic  0.579702  1.437820  0.633064  ...  1.290640  0.358833  0.296207
         Primary traffic   1.862990  1.178150  1.711440  ...  1.171830  1.974060  1.678320
...                             ...       ...       ...  ...       ...       ...       ...
Zn       Marine SOA        0.000826  0.002239  0.001256  ...  0.000265  0.000389  0.000436
         Aged seasalt      0.000000  0.000000  0.000000  ...  0.000814  0.000304  0.001018
         Biomass burning   0.002404  0.001699  0.002053  ...  0.002012  0.002270  0.001188
         Sea/road salt     0.000625  0.000457  0.000848  ...  0.000234  0.000596  0.000187
         Mineral dust      0.000000  0.000000  0.000000  ...  0.001355  0.000000  0.000000

as well as dfbootstrap_mapping_b and dfbootstrap_mapping_c, which are the tables of the mapping between reference and BS factors:

>>> grecb.dfbootstrap_mapping_c
                    Sulfate-rich Nitrate-rich Industrial Primary biogenic Primary traffic Marine SOA Aged seasalt Biomass burning Sea/road salt Mineral dust unmapped
BF-Sulfate-rich               94            0          2                0               3          0            0               0             0            0        0
BF-Nitrate-rich                0           99          0                0               0          0            0               0             0            0        0
BF-Industrial                  0            0         99                0               0          0            0               0             0            0        0
BF-Primary biogenic            0            0          0               99               0          0            0               0             0            0        0
BF-Primary traffic             0            0          0                0              99          0            0               0             0            0        0
BF-Marine SOA                  0            0          0                0               1         98            0               0             0            0        0
BF-Aged seasalt                0            0          0                0               0          0           99               0             0            0        0
BF-Biomass burning             0            0          0                0               0          0            0              99             0            0        0
BF-Sea/road salt               0            0          0                0               0          0            0               0            99            0        0
BF-Mineral dust                0            0          0                0               0          0            0               0             0           99        0
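A quick way to summarize this table is the share of bootstrap runs mapped back to their own reference factor. The following plain-Python sketch copies three rows of the table (zero entries omitted); `mapped_fraction` is a hypothetical helper, not part of py4pm.

```python
# Three rows of the mapping table above, keeping only non-zero counts.
mapping = {
    "Sulfate-rich": {"Sulfate-rich": 94, "Industrial": 2, "Primary traffic": 3},
    "Nitrate-rich": {"Nitrate-rich": 99},
    "Marine SOA": {"Primary traffic": 1, "Marine SOA": 98},
}


def mapped_fraction(mapping):
    """Share of bootstrap runs mapped back to their own reference factor."""
    return {
        factor: counts.get(factor, 0) / sum(counts.values())
        for factor, counts in mapping.items()
    }


for factor, share in mapped_fraction(mapping).items():
    print(f"{factor}: {share:.1%}")
```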

Plot utilities

Chemical profile (per microgram of total variable)

Chemical profile (in percentage of the sum of each species)

Contribution time series and uncertainties

grecb.plot.plot_contrib(profiles=["Primary biogenic"])

will produce the following graph:

[Figure: time series of the Primary biogenic factor contribution to the total variable.]

Since EPA PMF5 does not output the contribution (G) matrix of the bootstrap runs, only their chemical profiles (F), the uncertainties are estimated by computing the species concentration from the G matrix of the reference run and the F matrix of each bootstrap run. As a result, the output is “hacky”, since in the bootstrap method both the F and G matrices change. If you want to remove these uncertainties from the plot, just pass BS=False to the method.
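A minimal sketch of such an envelope, assuming the reference-run contributions (dfcontrib_c) are combined with each bootstrap profile (dfBS_profile_c), the two tables the readers above provide. It uses plain Python and a min/max range; how py4pm actually summarizes the bootstrap spread is not stated here, and `bootstrap_envelope` is a hypothetical name.

```python
# Reference contributions of the Sulfate-rich factor (normalized unit),
# three dates taken from dfcontrib_c shown earlier.
g_ref = {
    "2017-02-28": 0.321580,
    "2017-03-03": 0.429480,
    "2017-03-09": 0.643500,
}
# PMrecons concentration of the Sulfate-rich factor in three bootstrap
# profiles (Boot0, Boot1, Boot2 of dfBS_profile_c), in µg/m³.
f_boot = [4.412330, 2.259480, 4.330630]


def bootstrap_envelope(g_ref, f_boot):
    """Min and max of g * f over the bootstrap profiles, for each date."""
    envelope = {}
    for date, g in g_ref.items():
        series = [g * f for f in f_boot]
        envelope[date] = (min(series), max(series))
    return envelope


for date, (low, high) in bootstrap_envelope(g_ref, f_boot).items():
    print(f"{date}: {low:.3f} to {high:.3f} µg/m³")
```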

Utilities

Convert to cubic meter

To get the contributions in µg/m³, given by G⋅F, we need both the chemical profile F and the contribution G. The time series in µg/m³ of each species for every profile is then reconstructed by multiplying the contribution time series by the concentration of that species in the chemical profile. Since this is a very common computation, the method to_cubic_meter does just that:

>>> grecb.to_cubic_meter()
            Sulfate-rich  Nitrate-rich  ... Biomass burning  Sea/road salt  Mineral dust
Date                                    ...
2017-02-28      1.415756     -0.256609  ...        0.587988       0.220859      0.367516
2017-03-03      1.890786     -0.093951  ...        1.865035       0.018261      0.769456
2017-03-06     -0.431987     -0.366900  ...        1.615264       1.688948      0.547435
2017-03-09      2.833009     -0.006120  ...        3.302228       0.055808      2.177603
2017-03-12      2.923656      0.746705  ...        5.169836      -0.072856      1.701991
...                  ...           ...  ...             ...            ...           ...

Note that to_cubic_meter uses by default the constrained run, all the profiles and the total variable, but you can specify other conditions (see the documentation of this method).
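The arithmetic behind this conversion can be checked by hand. The sketch below is plain Python, independent of py4pm: it multiplies two normalized contributions from dfcontrib_c by the PMrecons row of dfprofiles_c, and `contributions_in_ug_m3` is a hypothetical helper name.

```python
# Normalized contributions (G) for two factors, from dfcontrib_c.
contrib = {
    "2017-02-28": {"Sulfate-rich": 0.321580, "Nitrate-rich": -0.105980},
    "2017-03-03": {"Sulfate-rich": 0.429480, "Nitrate-rich": -0.038802},
}
# PMrecons row of the profile matrix (F), from dfprofiles_c, in µg/m³.
pm_profile = {"Sulfate-rich": 4.402500, "Nitrate-rich": 2.421300}


def contributions_in_ug_m3(contrib, profile_row):
    """Scale each factor's contribution by its concentration in the profile."""
    return {
        date: {factor: g * profile_row[factor] for factor, g in row.items()}
        for date, row in contrib.items()
    }


converted = contributions_in_ug_m3(contrib, pm_profile)
print(round(converted["2017-02-28"]["Sulfate-rich"], 6))  # prints 1.415756
```

The result matches the first value of the table above.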

Relative contributions of species to the total mass

By default, the profile matrix F is in µg/m³. It is often convenient to know the relative contribution of each species to the mass of the “total variable” (for instance, the share of each species in the PM10). This is the ratio of each species in a profile to the total variable of that profile.

The method to_relative_mass handles this and returns a new dataframe:

>>> grecb.to_relative_mass()
              Sulfate-rich  Nitrate-rich  ... Biomass burning  Sea/road salt  Mineral dust
specie                                    ...
PMrecons          1.000000      1.000000  ...        1.000000       1.000000      1.000000
OC*               0.278319      0.000000  ...        0.432280       0.112655      0.213032
EC                0.037018      0.000000  ...        0.114617       0.052704      0.015278
Cl-               0.000000      0.001002  ...        0.008857       0.299413      0.000000
NO3-              0.068293      0.703011  ...        0.030845       0.000000      0.000000
SO42-             0.222074      0.004312  ...        0.030648       0.090505      0.094491
...                    ...           ...  ...             ...            ...           ...

The values are now fractions of the PMrecons mass (1.0 corresponds to 100%).
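As a sanity check, the same normalization can be reproduced in plain Python for the Sulfate-rich profile, using values from dfprofiles_c. The function name `relative_mass` is a hypothetical stand-in, not the py4pm method.

```python
# Sulfate-rich column of dfprofiles_c, in µg/m³.
profile = {"PMrecons": 4.402500, "OC*": 1.225300, "EC": 0.162970, "Cl-": 0.0}


def relative_mass(profile, total_var="PMrecons"):
    """Divide every species by the total variable of the same profile."""
    total = profile[total_var]
    return {specie: value / total for specie, value in profile.items()}


rel = relative_mass(profile)
print(round(rel["OC*"], 6))  # prints 0.278319
```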

Relative contribution of the factor for each species

Another useful piece of information is how a given species is apportioned among all the factors, shown as the total specie sum graph in the EPA PMF5 software. It is the amount of a given species in a factor divided by the sum of this species over all factors.

The method get_total_specie_sum returns this value, in %, for every species in all profiles:

>>> grecb.get_total_specie_sum()
              Sulfate-rich  Nitrate-rich  ...  Biomass burning  Sea/road salt  Mineral dust
specie                                    ...
PMrecons         27.520080     15.135575  ...        18.927439       2.277119     12.562033
OC*              30.440474      0.000000  ...        32.517372       1.019519     10.635658
EC               14.525003      0.000000  ...        30.931475       1.711146      2.736462
Cl-               0.000000      1.506558  ...        16.659544      67.752580      0.000000
NO3-             11.676593     66.107550  ...         3.627177       0.000000      0.000000
SO42-            66.571611      0.710942  ...         6.318883       2.244906     12.929878
...                    ...           ...  ...             ...            ...           ...

In this example, the Biomass burning factor apportions about 19% of the total PMrecons, 33% of the OC*, 31% of the EC, etc. We also see that NO3- is mainly apportioned by the Nitrate-rich factor (66%).
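The normalization itself is short. The sketch below uses a made-up three-factor profile matrix (the factor names and values are illustrative only, since the table above is truncated) and a hypothetical helper `total_specie_sum`.

```python
# Illustrative three-factor profile matrix in µg/m³ (made-up values).
profiles = {
    "Factor A": {"NO3-": 0.3, "SO42-": 1.0},
    "Factor B": {"NO3-": 1.7, "SO42-": 0.0},
    "Factor C": {"NO3-": 0.0, "SO42-": 0.25},
}


def total_specie_sum(profiles):
    """Percentage of each species carried by each factor."""
    species = list(next(iter(profiles.values())))
    totals = {sp: sum(f[sp] for f in profiles.values()) for sp in species}
    return {
        name: {
            sp: 100 * value / totals[sp] if totals[sp] else 0.0
            for sp, value in factor.items()
        }
        for name, factor in profiles.items()
    }


shares = total_specie_sum(profiles)
for name, row in shares.items():
    print(name, {sp: round(pct, 1) for sp, pct in row.items()})
```

Each species sums to 100% across the factors, which is what makes the rows comparable between species of very different concentrations.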

API

py4pm.chemutilities module

py4pm.chemutilities.format_ions(text)[source]
py4pm.chemutilities.get_OC_from_OC_star_and_organic(df)[source]

Re-compute OC taking into account the organic species

OC = OC* + sum(eqC_sp)

py4pm.chemutilities.get_sample_where(sites=None, date_min=None, date_max=None, species=None, min_sample=None, particle_size=None, con=None)[source]

Get dataframe that meet conditions

Sites

TODO

Date_min

TODO

Date_max

TODO

Min_sample

int, minimum samples size

Particle_size

Con

sqlite3 connection

Returns

TODO

py4pm.chemutilities.get_site_typology()[source]
py4pm.chemutilities.get_sourceColor(source=None)[source]

Return the hexadecimal color of the source(s)

If no option, then return the whole dictionary

source (str)

The name of the source

py4pm.chemutilities.get_sourcesCategories(profiles)[source]

Get the sources category according to the sources name.

Ex. Aged sea salt → Aged_sea_salt

Profiles

list

Returns

list

class py4pm.chemutilities.plot[source]

Bases: object

mainCompentOfPM(dateStart, dateEnd, seasonal=False, savefig=False, savedir=None)[source]

Plot a stacked bar plot of the different constituents of the PM

Parameters
  • station (str) – name of the station

  • dateStart, dateEnd – starting and ending date

  • seasonal (boolean, default False) – Whether to make a separate graph per season

  • savefig (boolean, default False) – Save the fig in png and pdf

  • savedir (str path, default None) – Where to save the figures

what_do_we_have(date_min=None, date_max=None, species=None, min_sample=None, particle_size=None, con=None)[source]

TODO: Docstring for what_do_we_have.

Sites

TODO

Date_min

TODO

Date_max

TODO

Species

TODO

Min_sample

TODO

Con

TODO

Returns

TODO

py4pm.chemutilities.replace_QL(dftmp, species=None, conn=None)[source]

Replace the -1 and -2 in the dataframe by the appropriate DL and QL values

The changes are done in place.

Dftmp

pandas DataFrame

py4pm.dateutilities module

py4pm.dateutilities.add_season(df, month=True, month_to_season=None)[source]

Add a season column to the DataFrame df.

Parameters
  • df (Pandas DataFrame.) – The DataFrame to work with.

  • month (add month number, default True) –

Returns

dfnew

Return type

a new pandas DataFrame with a ‘season’ column.
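The idea behind the month_to_season argument can be sketched with the standard library alone. The mapping below (meteorological seasons with English labels) is an assumption for illustration; the default used by add_season may differ, and `season_of` is a hypothetical helper.

```python
from datetime import date

# Assumed meteorological seasons; the default of add_season may differ.
MONTH_TO_SEASON = {
    12: "Winter", 1: "Winter", 2: "Winter",
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Fall", 10: "Fall", 11: "Fall",
}


def season_of(day, month_to_season=MONTH_TO_SEASON):
    """Return the season label of a date from its month number."""
    return month_to_season[day.month]


print(season_of(date(2017, 2, 28)), season_of(date(2017, 3, 3)))
```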

py4pm.dateutilities.format_xaxis_timeseries(ax)[source]

Format the x-axis timeseries with minortick = month and majortick=year

Ax

the ax to format

py4pm.deltaTool module

class py4pm.deltaTool.LegendTitle(text_props=None)[source]

Bases: object

legend_artist(legend, orig_handle, fontsize, handlebox)[source]
py4pm.deltaTool.compute_PD(df1, df2, factor1=None, factor2=None, isRelativeMass=True)[source]

Compute the PD of the factors factor1 and factor2 in the profiles df1 and df2.

py4pm.deltaTool.compute_SID(df1, df2, factor1=None, factor2=None, isRelativeMass=True)[source]

Compute the SID of the factors factor1 and factor2 in the profiles df1 and df2.
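For readers unfamiliar with these two metrics, here is a plain-Python sketch assuming the usual DeltaTool definitions: PD is the Pearson distance, 1 - r², and SID is the standardized identity distance, (√2/m) Σ |x - y| / (x + y) over the m species the two profiles share. The function names are hypothetical and the handling of missing or zero species in py4pm may differ.

```python
from math import sqrt


def sid_distance(x, y):
    """Standardized identity distance between two profiles (lists of floats)."""
    # Skip species that are zero in both profiles to avoid dividing by zero.
    pairs = [(a, b) for a, b in zip(x, y) if a + b > 0]
    if not pairs:
        raise ValueError("no species with a non-zero concentration")
    return sqrt(2) / len(pairs) * sum(abs(a - b) / (a + b) for a, b in pairs)


def pearson_distance(x, y):
    """Pearson distance, 1 - r**2, between two profiles of equal length."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return 1 - cov ** 2 / (var_x * var_y)
```

Two identical profiles give SID = 0 and PD = 0; both distances grow as the profiles diverge.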

py4pm.deltaTool.get_all_SID_PD(PMF_profile, stations, factor2=None, isRelativeMass=False)[source]

Compute the SID and PD for all profiles in PMF_profile for the stations stations.

py4pm.deltaTool.get_profile_from_PMF(pmfs)[source]

Get a profile matrix from a list of PMF object

Pmfs

TODO

Returns

TODO

py4pm.deltaTool.plot_all_stations_similarity_by_source(PMF_profile)[source]

Plot all individual pair of profile for each common source.

py4pm.deltaTool.plot_deltatool_pretty(ax)[source]

Format the given ax to conform with the “deltatool-like” visualization.

py4pm.deltaTool.plot_relativeMass(PMF_profile, source='Biomass burning', isRelativeMass=True, totalVar='PM10', naxe=1, site_typologie=None)[source]
py4pm.deltaTool.plot_similarity_profile(SID, PD, err='ci', plotAll=False)[source]

Plot a point in the SID/PD space (+/-err) for all profile in SID and PD.

Parameters
  • SID (DataFrame) – the SID matrix, with index (factor, station) and column (station)

  • PD (DataFrame) – the PD matrix, with index (factor, station) and column (station)

  • err (“ci” or “sd”) – the type of error for xerr and yerr

  • plotAll (boolean) – whether to plot each pair of profiles

Returns

  • similarity (pd.DataFrame) – columns: x, y, xerr, yerr, n index: profiles

  • handles_labels (tuple of handles and labels) – legend of the plot

py4pm.deltaTool.plot_similarityplot(PMF_profile, station1, station2, source1, source2=None, SID=None, PD=None, isRelativeMass=False, ax=None, plot_kw={})[source]

Plot the distance in the SID/PD space of 2 profiles for 2 stations.

py4pm.deltaTool.save4deltaTool(contrib, profile)[source]
py4pm.deltaTool.to_relativeMass(df, totalVar='PM10')[source]

Normalize the profile df to the relative mass with regard to the totalVar (=PM10 or PM2.5).

py4pm.pmfutilities module

class py4pm.pmfutilities.CachedAccessor(name, accessor)[source]

Bases: object

Custom property-like object (descriptor) for caching accessors.

Parameters
  • name (str) – The namespace this will be accessed under, e.g. df.foo

  • accessor (cls) – The class with the extension methods. The class’ __init__ method should expect one of a Series, DataFrame or Index as the single argument data
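The caching behaviour can be illustrated with a small descriptor. This is a sketch of the general pattern (the same one pandas uses for its accessors), not a copy of the py4pm source; `Reader` and `Model` are hypothetical stand-ins for ReaderAccessor and PMF.

```python
class CachedAccessor:
    """Descriptor that builds an accessor once per instance and caches it."""

    def __init__(self, name, accessor):
        self._name = name
        self._accessor = accessor

    def __get__(self, obj, cls):
        if obj is None:
            # Accessed on the class itself: return the accessor class.
            return self._accessor
        accessor_obj = self._accessor(obj)
        # Store on the instance so later lookups bypass the descriptor.
        object.__setattr__(obj, self._name, accessor_obj)
        return accessor_obj


class Reader:  # hypothetical stand-in for ReaderAccessor
    def __init__(self, data):
        self._data = data

    def site(self):
        return self._data.site


class Model:  # hypothetical stand-in for PMF
    read = CachedAccessor("read", Reader)

    def __init__(self, site):
        self.site = site


m = Model("GRE-cb")
print(m.read.site())  # prints GRE-cb
```

Because the descriptor defines no `__set__`, the cached instance attribute takes precedence on later lookups, so the accessor is built only once per object.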

class py4pm.pmfutilities.PMF(site, BDIR, program=None)[source]

Bases: object

PMF is able to read files from the US EPA PMF5.0 software output (in xlsx format), then parse them into a handier format (pandas DataFrame). Several plot utilities are also available.

get_seasonal_contribution(specie=None, annual=True, normalize=True, constrained=True)[source]

Get a dataframe of seasonal contribution

Parameters
  • specie (default None) –

  • annual (default True) –

  • normalize (default True) –

  • constrained (default True) –

Returns

df

Return type

seasonal contribution

get_total_specie_sum(constrained=True)[source]

Return the total specie sum profiles in %

Parameters

constrained (boolean, default True) – use the constrained run or not

Returns

df – The normalized species sum per profiles

Return type

pd.DataFrame

plot

alias of PlotterAccessor

print_uncertainties_summary(constrained=True, profiles=None, species=None)[source]

Get the uncertainties given by BS, BS-DISP and DISP for the given profiles and species

Parameters
  • constrained (boolean, True) – Use the constrained run (False for the base run)

  • profiles (list of str) – list of profiles, default all profiles

  • species (list of str) – list of species, default all species

Returns

df – BS, DISP and BS-DISP ranges

Return type

pd.DataFrame

read

alias of ReaderAccessor

recompute_new_species(specie)[source]

Recompute a specie given the other species. For instance, recompute OC from OC* and a list of organic species.

It modifies in place both dfprofiles_b and dfprofiles_c, and updates self.species.

Parameters

specie (str in ["OC",]) –

rename_profile_to_profile_category()[source]

Rename the factor profile name to match the category

replace_totalVar(newTotalVar)[source]

Replace the total variable in all dataframes

NewTotalVar

TODO

Returns

TODO

to_cubic_meter(constrained=True, specie=None, profiles=None)[source]

Convert the contribution to µg/m³ for the given specie

Parameters
  • specie (str, the specie, default totalVar) –

  • profiles (list of profile, default all profiles) –

Returns

df

Return type

dataframe

to_relative_mass(constrained=True, species=None, profiles=None)[source]

Compute the factor profile relative mass (i.e. each species divided by the totalVar mass)

Parameters
  • constrained (TODO) –

  • species (TODO) –

  • profiles (TODO) –

class py4pm.pmfutilities.PlotterAccessor(data)[source]

Bases: object

Accessor class for the PMF class with all plotter methods.

plot_all_profiles(constrained=True, profiles=None, specie=None, BS=True, DISP=True, BSDISP=False, plot_save=False, savedir=None)[source]

TODO: Docstring for plot_all_profiles.

Parameters
  • constrained (Boolean, default True) – Whether to use the constrained run or the base one

  • profiles (list of string) – Profiles to plot

  • species

  • BS, DISP, BSDISP – Use them as error estimation

  • plot_save (boolean, default False) – Whether to save the plot

  • savedir (str) – Path to save the plot

plot_contrib(dfBS=None, dfDISP=None, dfcontrib=None, profiles=None, specie=None, constrained=True, plot_save=False, savedir=None, BS=True, DISP=True, BSDISP=False, new_figure=True, **kwargs)[source]

Plot temporal contribution in µg/m3.

Parameters
  • dfBS (pd.DataFrame, default self.dfBS_profile_c) – DataFrame with multiindex [species, profile] and an arbitrary number of columns.

  • dfcontrib (pd.DataFrame, default self.dfcontrib_c) – Profile as column and specie as index.

  • profiles (list of string, default self.profiles) – profile to plot (one figure per profile)

  • specie (string, default totalVar.) – specie to plot (y-axis)

  • plot_save (boolean, default False) – Save the graph in savedir.

  • savedir (string) – directory to save the plot

plot_per_microgramm(df=None, constrained=True, profiles=None, species=None, plot_save=False, savedir=None)[source]

Plot profiles in concentration unit (µg/m3).

Parameters
  • df (DataFrame with multiindex [species, profile] and an arbitrary) – number of column. Default to dfBS_profile_c.

  • constrained (Boolean, either to use the constrained run or the base run) –

  • profiles (list of str, profile to plot (one figure per profile)) –

  • species (list of str, specie to plot (x-axis)) –

  • plot_save (boolean, default False. Save the graph in savedir.) –

  • savedir (string, directory to save the plot.) –

plot_seasonal_contribution(constrained=True, dfcontrib=None, dfprofiles=None, profiles=None, specie=None, plot_save=False, savedir=None, annual=True, normalize=True, ax=None, barplot_kwarg={})[source]

Plot the relative contribution of the profiles.

Parameters
  • dfcontrib (DataFrame with contribution as column and date as index.) –

  • dfprofiles (DataFrame with profile as column and specie as index.) –

  • profiles (list, profile to plot (one figure per profile)) –

  • specie (string, default totalVar. specie to plot) –

  • plot_save (boolean, default False. Save the graph in savedir.) –

  • savedir (string, directory to save the plot.) –

  • annual (plot annual contribution) –

  • normalize (plot relative contribution or absolute contribution.) –

Returns

df

Return type

DataFrame

plot_stacked_contribution(constrained=True, order=None, plot_kwargs=None)[source]

Plot a stacked plot for the contribution

Parameters
  • constrained (TODO) –

  • order (TODO) –

  • plot_kwargs (TODO) –

plot_stacked_profiles(constrained=True)[source]

Plot the distribution of the species among the profiles, normalized to 100%

Parameters

constrained (boolean, default True) – use the constrained run or not

Returns

ax

Return type

the axe

plot_totalspeciesum(df=None, profiles=None, species=None, constrained=True, plot_save=False, savedir=None, **kwargs)[source]

Plot profiles in percentage of total specie sum (%).

Parameters
  • df (DataFrame with multiindex [species, profile] and an arbitrary) – number of column. Default to dfBS_profile_c.

  • profiles (list, profile to plot (one figure per profile)) –

  • species (list, specie to plot (x-axis)) –

  • plot_save (boolean, default False. Save the graph in savedir.) –

  • savedir (string, directory to save the plot.) –

class py4pm.pmfutilities.ReaderAccessor(data)[source]

Bases: object

Accessor class for the PMF class with all reader methods.

read_base_bootstrap()[source]

Read the “base” bootstrap result from the file: ‘_boot.xlsx’ and add :

  • self.dfBS_profile_b: all mapped profile

  • self.dfbootstrap_mapping_b: table of mapped profiles

read_base_contributions()[source]

Read the “base” contributions result from the file: ‘_base.xlsx’, sheet “Contributions”, and add :

  • self.dfcontrib_b: base factors contribution

read_base_profiles()[source]

Read the “base” profiles result from the file: ‘_base.xlsx’, sheet “Profiles”, and add :

  • self.dfprofiles_b: base factors profile

read_base_uncertainties_summary()[source]

Read the _BaseErrorEstimationSummary.xlsx file and add:

  • self.df_uncertainties_summary_b : uncertainties from BS, DISP and BS-DISP

read_constrained_bootstrap()[source]

Read the “constrained” bootstrap result from the file: ‘_Gcon_profile_boot.xlsx’ and add :

  • self.dfBS_profile_c: all mapped profile

  • self.dfbootstrap_mapping_c: table of mapped profiles

read_constrained_contributions()[source]

Read the “constrained” contributions result from the file: ‘_Constrained.xlsx’, sheet “Contributions”, and add :

  • self.dfcontrib_c: constrained factors contribution

read_constrained_profiles()[source]

Read the “constrained” profiles result from the file: ‘_Constrained.xlsx’, sheet “Profiles”, and add :

  • self.dfprofiles_c: constrained factors profile

read_constrained_uncertainties_summary()[source]

Read the _ConstrainedErrorEstimationSummary.xlsx file and add :

  • self.df_uncertainties_summary_c : uncertainties from BS, DISP and BS-DISP

read_metadata()[source]

Get profiles, species and co

It adds a totalVariable (by default one of “PM10”, “PM2.5”, “PMrecons”, “PM10recons” or “PM10rec”). Otherwise, it tries to guess (a variable with “PM” in its name).

Description

py4pm is a set of utilities to handle the xlsx output of EPA PMF5 software in python.

This project started because I needed to run several PMF for my PhD and also needed to run some computations on these results. The raw output of the EPA PMF5 software is a bit messy and hard to understand at first glance, and copy/pasting xlsx files is not to my taste… So I ended up developing these tools to map the xlsx output to nice python objects, on which I can easily run some computations.

Since I needed to plot the results afterward, I also added some plot utilities to this package. It has built-in support for plotting:

  • chemical profile (both absolute and normalized)

  • species distribution among factors

  • time series of contributions (for all species and profiles)

  • uncertainties plots (Bootstrap and DISP)

  • seasonal contribution

Moreover, this package includes some of the DeltaTool utilities developed by the JRC, notably for comparing the chemical similarity of profiles between several PMF results.