Title: | Multidimensional Queries Enriched with Geographic Data |
---|---|
Description: | Multidimensional systems allow complex queries to be carried out in an easy way. The geographical dimension, together with the temporal dimension, plays a fundamental role in multidimensional systems. Through this package, vector geographic data layers can be associated to the attributes of geographic dimensions, so that the results of multidimensional queries can be obtained directly as vector layers. The multidimensional structures on which we can define the queries can be created from a flat table or imported directly using functions from this package. |
Authors: | Jose Samos [aut, cre] , Universidad de Granada [cph] |
Maintainer: | Jose Samos <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.2.2.9000 |
Built: | 2025-01-25 04:19:51 UTC |
Source: | https://github.com/josesamos/geomultistar |
Add a dimension table to a multistar
To add a dimension table to a multistar object, we must indicate the name that we give to the dimension, the tibble that contains the data, and the name of the attribute corresponding to the table's primary key.
add_dimension(
  ms,
  dimension_name = NULL,
  dimension_table = NULL,
  dimension_key = NULL,
  fact_name = NULL,
  fact_key = NULL,
  key_as_data = FALSE
)

## S3 method for class 'multistar'
add_dimension(
  ms,
  dimension_name = NULL,
  dimension_table = NULL,
  dimension_key = NULL,
  fact_name = NULL,
  fact_key = NULL,
  key_as_data = FALSE
)
ms |
A multistar object. |
dimension_name |
A string, name of dimension table. |
dimension_table |
A tibble, the dimension table. |
dimension_key |
A string, name of the dimension primary key. |
fact_name |
A string, name of fact table. |
fact_key |
A string, name of the dimension foreign key. |
key_as_data |
A boolean, define the primary key as an attribute of the dimension accessible in queries? |
We cannot add a dimension without defining a correspondence with one of the multistar's fact tables. We have to define the name of the fact table and the name of its foreign key. The referential integrity of the instances of the facts is checked.
The attribute that is used as the primary key will no longer be accessible for queries (its function is considered to be exclusively related to facts). If you want to use it for queries, it must be explicitly indicated by the boolean parameter key_as_data.
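The manual does not show how the referential integrity check is implemented. As a rough sketch of the idea only, in base R and with hypothetical miniature tables (where_dim, age_fact and their values are illustrative assumptions, not package data), every foreign key value in the fact table should match a primary key value in the dimension table:

```r
# Hypothetical miniature tables; names and values are illustrative only.
where_dim <- data.frame(
  where_pk = c(1, 2, 3),
  city = c("Boston", "Albany", "Akron")
)
age_fact <- data.frame(
  where_fk = c(1, 2, 2, 4),
  n_deaths = c(10, 5, 7, 3)
)

# Foreign key values without a matching primary key violate
# referential integrity.
orphans <- setdiff(age_fact$where_fk, where_dim$where_pk)
integrity_ok <- length(orphans) == 0

orphans       # 4
integrity_ok  # FALSE
```

In this sketch the fact row with where_fk equal to 4 has no matching dimension row, so the check fails. How add_dimension() reports such a failure is not described here.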
A multistar object.
Other multistar functions: add_facts(), multistar(), relate_dimension()
ms <- multistar() |>
  add_facts(
    fact_name = "mrs_age",
    fact_table = mrs_fact_age,
    measures = "n_deaths",
    nrow_agg = "count"
  ) |>
  add_facts(
    fact_name = "mrs_cause",
    fact_table = mrs_fact_cause,
    measures = c("pneumonia_and_influenza_deaths", "other_deaths"),
    nrow_agg = "nrow_agg"
  ) |>
  add_dimension(
    dimension_name = "where",
    dimension_table = mrs_where,
    dimension_key = "where_pk",
    fact_name = "mrs_age",
    fact_key = "where_fk"
  ) |>
  add_dimension(
    dimension_name = "when",
    dimension_table = mrs_when,
    dimension_key = "when_pk",
    fact_name = "mrs_age",
    fact_key = "when_fk",
    key_as_data = TRUE
  ) |>
  add_dimension(
    dimension_name = "who",
    dimension_table = mrs_who,
    dimension_key = "who_pk",
    fact_name = "mrs_age",
    fact_key = "who_fk"
  )
Add a fact table to a multistar
To add a fact table to a multistar object, we must indicate the name that we give to the facts, the tibble that contains the data, and a vector of attribute names corresponding to the measures.
add_facts(
  ms,
  fact_name = NULL,
  fact_table = NULL,
  measures = NULL,
  agg_functions = NULL,
  nrow_agg = "nrow_agg"
)

## S3 method for class 'multistar'
add_facts(
  ms,
  fact_name = NULL,
  fact_table = NULL,
  measures = NULL,
  agg_functions = NULL,
  nrow_agg = "nrow_agg"
)
ms |
A multistar object. |
fact_name |
A string, name of fact table. |
fact_table |
A tibble, the fact table. |
measures |
A vector of measure names. |
agg_functions |
A vector of aggregation function names. If none is indicated, the default is SUM. Additionally they can be MAX or MIN. |
nrow_agg |
A string, measurement name for the number of rows aggregated. If it does not exist, it is added to the table. |
Each measure requires an aggregation function, which by default is SUM. It can be SUM, MAX or MIN. Mean is not among the possible aggregation functions because the mean of the means of subsets of the data is not necessarily the mean of the total data.
An additional measure, nrow_agg, corresponding to the number of aggregated rows, is always added. Together with SUM, it allows us to obtain the mean if needed. As the value of this parameter, you can specify an attribute of the table or the name that you want to assign to it (if it does not exist, it is added to the table).
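The reasoning about the mean can be checked with a small base R sketch. The data frame below is hypothetical (its column names and values are assumptions for illustration), and the row count computed here only plays the role that the nrow_agg measure has in the package:

```r
# Hypothetical fact rows for two groups of unequal size.
facts <- data.frame(
  group = c("a", "a", "a", "b"),
  n_deaths = c(10, 20, 30, 100)
)

# Aggregate each group with SUM and count the aggregated rows.
sum_by_group <- tapply(facts$n_deaths, facts$group, sum)
nrow_agg <- tapply(facts$n_deaths, facts$group, length)

# Averaging the per-group means does not give the overall mean.
mean_of_means <- mean(sum_by_group / nrow_agg)

# Sums and row counts can be aggregated again without loss.
mean_from_sums <- sum(sum_by_group) / sum(nrow_agg)

mean_of_means         # 60
mean_from_sums        # 40
mean(facts$n_deaths)  # 40
```

Because sums and counts remain correct under further aggregation, keeping both is enough to recover the mean at any level.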
A multistar object.
Other multistar functions: add_dimension(), multistar(), relate_dimension()
ms <- multistar() |>
  add_facts(
    fact_name = "mrs_age",
    fact_table = mrs_fact_age,
    measures = "n_deaths",
    nrow_agg = "count"
  ) |>
  add_facts(
    fact_name = "mrs_cause",
    fact_table = mrs_fact_cause,
    measures = c("pneumonia_and_influenza_deaths", "other_deaths"),
    nrow_agg = "nrow_agg"
  )
Defines geographic attributes in one of two ways: by associating the instances of attributes of the geographic dimension with the instances of a geographic layer, or from the geometry of another previously defined geographic attribute. Multiple attributes can be specified in the attribute parameter.
define_geoattribute(
  gms,
  dimension = NULL,
  attribute = NULL,
  additional_attributes = NULL,
  from_layer = NULL,
  by = NULL,
  from_attribute = NULL
)

## S3 method for class 'geomultistar'
define_geoattribute(
  gms,
  dimension = NULL,
  attribute = NULL,
  additional_attributes = NULL,
  from_layer = NULL,
  by = NULL,
  from_attribute = NULL
)
gms |
A geomultistar object. |
dimension |
A string, dimension name. |
attribute |
A vector, attribute names. |
additional_attributes |
A vector, attribute names. |
from_layer |
An sf object. |
by |
A vector of correspondences between attributes of the dimension and attributes of the layer. |
from_attribute |
A string, attribute name. |
If defined from a layer (from_layer parameter), the attributes used for the join between the dimension and layer tables must also be indicated (by parameter).
If defined from another attribute, that attribute should have a finer granularity, so that the result can be obtained by grouping its instances.
If no value is indicated in the attribute parameter, the definition applies to all the attributes of the dimension that do not have a previous definition; they are obtained from the attribute indicated in the from_attribute parameter.
A geomultistar object.
Other geo functions: geomultistar(), get_empty_geoinstances(), run_geoquery()
gms <- geomultistar(ms = ms_mrs, geodimension = "where") |>
  define_geoattribute(
    attribute = "city",
    from_layer = usa_cities,
    by = c("city" = "city", "state" = "state")
  )

gms <- gms |>
  define_geoattribute(attribute = c("region", "all_where"),
                      from_attribute = "city")

gms <- gms |>
  define_geoattribute(from_attribute = "city")

gms <- gms |>
  define_geoattribute(attribute = "all_where", from_layer = usa_nation)
dimensional_query S3 class
An empty dimensional_query object is created where you can select fact measures, dimension attributes and filter dimension rows.
dimensional_query(ms = NULL)
ms |
A multistar object. |
A dimensional_query object.
Other query functions: filter_dimension(), run_query(), select_dimension(), select_fact()
# ms_mrs <- ct_mrs |>
#   constellation_as_multistar()

# dq <- dimensional_query(ms_mrs)
Allows you to define selection conditions for dimension rows.
filter_dimension(dq, name = NULL, ...)

## S3 method for class 'dimensional_query'
filter_dimension(dq, name = NULL, ...)
dq |
A dimensional_query object. |
name |
A string, name of the dimension. |
... |
Conditions, defined in exactly the same way as in dplyr::filter. |
Conditions can be defined on any attribute of the dimension (not only on attributes selected in the query for the dimension). The selection is made based on the function dplyr::filter. Conditions are defined in exactly the same way as in that function.
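Since conditions follow dplyr::filter semantics, their behaviour can be tried directly on a plain data frame. The sketch below assumes the dplyr package is installed and uses a hypothetical table; the zero-padded string weeks are an assumption suggested by the comparison against "03" in this manual's examples:

```r
# Hypothetical rows shaped like a "when" dimension; values are
# illustrative only. Weeks are assumed to be zero-padded strings.
when_dim <- data.frame(
  when_happened_year = c("1962", "1962", "1962", "1962"),
  when_happened_week = c("01", "03", "04", "10")
)

# The comparison is made on strings, so the zero padding matters:
# "10" sorts after "03" only because both have two characters.
first_weeks <- dplyr::filter(when_dim, when_happened_week <= "03")

first_weeks$when_happened_week
```

If the weeks were stored without padding, a string comparison such as "10" <= "3" would also hold, so the attribute type is worth checking before writing a condition.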
A dimensional_query object.
Other query functions: dimensional_query(), run_query(), select_dimension(), select_fact()
dq <- dimensional_query(ms_mrs) |>
  filter_dimension(name = "when", when_happened_week <= "03") |>
  filter_dimension(name = "where", city == "Boston")
geomultistar S3 class
A geomultistar object is created. Dimensions that contain geographic information are indicated.
geomultistar(ms = NULL, geodimension = NULL)
ms |
A multistar object. |
geodimension |
A vector of dimension names. |
A geomultistar object.
Other geo functions: define_geoattribute(), get_empty_geoinstances(), run_geoquery()
# gms <- geomultistar(ms = ms_mrs, geodimension = "where")
Gets the instances of the given geographic attribute that do not have a geometry associated with them.
get_empty_geoinstances(gms, dimension = NULL, attribute = NULL)

## S3 method for class 'geomultistar'
get_empty_geoinstances(gms, dimension = NULL, attribute = NULL)
gms |
A geomultistar object. |
dimension |
A string, dimension name. |
attribute |
A string, attribute name. |
An sf object.
Other geo functions: define_geoattribute(), geomultistar(), run_geoquery()
gms <- geomultistar(ms = ms_mrs, geodimension = "where") |>
  define_geoattribute(
    attribute = "city",
    from_layer = usa_cities,
    by = c("city" = "city", "state" = "state")
  )

empty <- gms |>
  get_empty_geoinstances(attribute = "city")
Selection of data from the 122 Cities Mortality Reporting System by age group, for the first 3 weeks of 1962.
mrs_age_test
A tibble.
The original dataset begins in 1962. For each week, in 122 US cities, mortality figures by age group and cause, considered separately, are included (i.e., the combination of age group and cause is not included). In the cause, only a distinction is made between pneumonia or influenza and others.
Two additional dates have been generated, which were not present in the original dataset.
Fact age table of the Mortality Reporting System. Defined from ms_mrs. Foreign keys have been renamed, only a when dimension has been considered, and the type for the when dimension has been changed.
mrs_fact_age
A tibble.
https://CRAN.R-project.org/package=starschemar
Fact cause table of the Mortality Reporting System. Defined from ms_mrs. Foreign keys have been renamed, only a when dimension has been considered, and the type for the when dimension has been changed.
mrs_fact_cause
A tibble.
https://CRAN.R-project.org/package=starschemar
When dimension table of the Mortality Reporting System. Defined from ms_mrs. The primary key has been renamed and its type has been changed. The other attributes have also been renamed.
mrs_when
A tibble.
https://CRAN.R-project.org/package=starschemar
Where dimension table of the Mortality Reporting System. Defined from ms_mrs. The primary key has been renamed.
mrs_where
A tibble.
https://CRAN.R-project.org/package=starschemar
Who dimension table of the Mortality Reporting System. Defined from ms_mrs. The primary key has been renamed.
mrs_who
A tibble.
https://CRAN.R-project.org/package=starschemar
Multistar for the Mortality Reporting System considering age and cause classification.
ms_mrs
A multistar object.
https://CRAN.R-project.org/package=starschemar
Multistar for the Mortality Reporting System considering age and cause classification data test.
ms_mrs_test
A multistar object.
https://CRAN.R-project.org/package=starschemar
multistar S3 class
Creates an empty multistar object that allows you to import fact and dimension tables.
multistar()
A multistar object.
Other multistar functions: add_dimension(), add_facts(), relate_dimension()
ms <- multistar()
multistar as a flat table
We can obtain a flat table, implemented using a tibble, from a multistar (which can be the result of a query). If it only has one fact table, it is not necessary to provide its name.
multistar_as_flat_table(ms, fact = NULL)

## S3 method for class 'multistar'
multistar_as_flat_table(ms, fact = NULL)
ms |
A multistar object. |
fact |
A string, name of the fact. |
A tibble.
ft <- ms_mrs |>
  multistar_as_flat_table(fact = "mrs_age")

ms <- dimensional_query(ms_mrs) |>
  select_dimension(name = "where",
                   attributes = c("city", "state")) |>
  select_dimension(name = "when",
                   attributes = c("when_happened_year")) |>
  select_fact(name = "mrs_age",
              measures = c("n_deaths")) |>
  select_fact(
    name = "mrs_cause",
    measures = c("pneumonia_and_influenza_deaths", "other_deaths")
  ) |>
  filter_dimension(name = "when", when_happened_week <= "03") |>
  filter_dimension(name = "where", city == "Boston") |>
  run_query()

ft <- ms |>
  multistar_as_flat_table()
Relate a dimension table to a fact table in a multistar
When a dimension is added to a multistar, it can be related to only one fact table. It can then be related to other fact tables in the multistar using this function. The name of the fact table and its foreign key must be indicated. The referential integrity of the instances of the facts is checked.
relate_dimension(ms, dimension_name = NULL, fact_name = NULL, fact_key = NULL)

## S3 method for class 'multistar'
relate_dimension(ms, dimension_name = NULL, fact_name = NULL, fact_key = NULL)
ms |
A multistar object. |
dimension_name |
A string, name of dimension table. |
fact_name |
A string, name of fact table. |
fact_key |
A string, name of the dimension foreign key. |
A multistar object.
Other multistar functions: add_dimension(), add_facts(), multistar()
ms <- multistar() |>
  add_facts(
    fact_name = "mrs_age",
    fact_table = mrs_fact_age,
    measures = "n_deaths",
    nrow_agg = "count"
  ) |>
  add_facts(
    fact_name = "mrs_cause",
    fact_table = mrs_fact_cause,
    measures = c("pneumonia_and_influenza_deaths", "other_deaths"),
    nrow_agg = "nrow_agg"
  ) |>
  add_dimension(
    dimension_name = "where",
    dimension_table = mrs_where,
    dimension_key = "where_pk",
    fact_name = "mrs_age",
    fact_key = "where_fk"
  ) |>
  add_dimension(
    dimension_name = "when",
    dimension_table = mrs_when,
    dimension_key = "when_pk",
    fact_name = "mrs_age",
    fact_key = "when_fk",
    key_as_data = TRUE
  ) |>
  add_dimension(
    dimension_name = "who",
    dimension_table = mrs_who,
    dimension_key = "who_pk",
    fact_name = "mrs_age",
    fact_key = "who_fk"
  ) |>
  relate_dimension(dimension_name = "where",
                   fact_name = "mrs_cause",
                   fact_key = "where_fk") |>
  relate_dimension(dimension_name = "when",
                   fact_name = "mrs_cause",
                   fact_key = "when_fk")
After defining a query and geographic dimensions, run the query and select the geographic data associated with it to get a geographic data layer as the result.
run_geoquery(
  dq,
  unify_by_grain = TRUE,
  fact = NULL,
  dimension = NULL,
  attribute = NULL,
  wider = FALSE
)

## S3 method for class 'dimensional_query'
run_geoquery(
  dq,
  unify_by_grain = TRUE,
  fact = NULL,
  dimension = NULL,
  attribute = NULL,
  wider = FALSE
)
dq |
A dimensional_query object. |
unify_by_grain |
A boolean, unify facts with the same grain. |
fact |
A string, name of the fact. |
dimension |
A string, name of the geographic dimension. |
attribute |
A string, name of the geographic attribute to consider. |
wider |
A boolean, avoid repeating geographic data. |
When there are several fact tables with the same grain, they are unified by default; the unify_by_grain parameter lets us indicate that we do not want to unify them.
If the result only has one fact table, it is not necessary to provide its name. Nor is it necessary to indicate the name of the geographic dimension if there is only one available.
If no attribute is specified, the geographic attribute of the result with finer granularity is selected.
In geographic layers, geographic objects are not repeated. The tables are wide: for each object the rest of the attributes are defined as columns. By means of the parameter wider we can indicate that we want a result of this type.
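The difference between a long result and a wide one can be illustrated with base R's reshape(). This is only a sketch of the table shapes, not of how run_geoquery() builds its output; the data frame, its values, and the resulting column names are assumptions for illustration:

```r
# Hypothetical long result: one row per city and week, so each
# city (and its geometry, in a real layer) would be repeated.
long <- data.frame(
  city = c("Boston", "Boston", "Lowell", "Lowell"),
  week = c("01", "02", "01", "02"),
  n_deaths = c(10, 12, 3, 4)
)

# Wide form: one row per city; each week becomes its own column.
wide <- reshape(
  long,
  idvar = "city",
  timevar = "week",
  direction = "wide"
)

nrow(long)   # 4
nrow(wide)   # 2
names(wide)  # "city" "n_deaths.01" "n_deaths.02"
```

In the wide form each geographic object appears once, which is the shape usually expected of a vector layer.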
An sf object.
Other geo functions: define_geoattribute(), geomultistar(), get_empty_geoinstances()
gms <- geomultistar(ms = ms_mrs, geodimension = "where") |>
  define_geoattribute(
    attribute = "city",
    from_layer = usa_cities,
    by = c("city" = "city", "state" = "state")
  ) |>
  define_geoattribute(
    attribute = "state",
    from_layer = usa_states,
    by = c("state" = "state")
  ) |>
  define_geoattribute(attribute = "region",
                      from_attribute = "state") |>
  define_geoattribute(attribute = "all_where",
                      from_layer = usa_nation)

gdq <- dimensional_query(gms) |>
  select_dimension(name = "where",
                   attributes = c("state", "city")) |>
  select_dimension(name = "when",
                   attributes = c("when_happened_year", "when_happened_week")) |>
  select_fact(
    name = "mrs_age",
    measures = c("n_deaths")
  ) |>
  select_fact(name = "mrs_cause",
              measures = c("pneumonia_and_influenza_deaths", "other_deaths")) |>
  filter_dimension(name = "when", when_happened_week <= "03") |>
  filter_dimension(name = "where", state == "MA")

sf <- gdq |>
  run_geoquery()

sfw <- gdq |>
  run_geoquery(wider = TRUE)
Once we have selected the facts, dimensions and defined the conditions on the instances, we can execute the query to obtain the result.
run_query(dq, unify_by_grain = TRUE)

## S3 method for class 'dimensional_query'
run_query(dq, unify_by_grain = TRUE)
dq |
A dimensional_query object. |
unify_by_grain |
A boolean, unify facts with the same grain. |
Facts with the same grain are unified by default; the unify_by_grain parameter lets us indicate that we do not want to unify them.
A multistar object.
Other query functions: dimensional_query(), filter_dimension(), select_dimension(), select_fact()
ms <- dimensional_query(ms_mrs) |>
  select_dimension(name = "where",
                   attributes = c("city", "state")) |>
  select_dimension(name = "when",
                   attributes = c("when_happened_year")) |>
  select_fact(
    name = "mrs_age",
    measures = c("n_deaths"),
    agg_functions = c("MAX")
  ) |>
  select_fact(
    name = "mrs_cause",
    measures = c("pneumonia_and_influenza_deaths", "other_deaths")
  ) |>
  filter_dimension(name = "when", when_happened_week <= "03") |>
  filter_dimension(name = "where", city == "Boston") |>
  run_query()
Save the result of a geoquery in a geopackage. The result can be a layer in the form of a flat table or a list consisting of a layer and a description table of the variables.
save_as_geopackage(sf, layer_name, file_name = NULL, filepath = NULL)
sf |
A |
layer_name |
A string. |
file_name |
A string. |
filepath |
A string. |
A tibble or a list of tibble objects.
gms <- geomultistar(ms = ms_mrs, geodimension = "where") |>
  define_geoattribute(
    attribute = "city",
    from_layer = usa_cities,
    by = c("city" = "city", "state" = "state")
  ) |>
  define_geoattribute(
    attribute = "state",
    from_layer = usa_states,
    by = c("state" = "state")
  ) |>
  define_geoattribute(attribute = "region",
                      from_attribute = "state") |>
  define_geoattribute(attribute = "all_where",
                      from_layer = usa_nation)

gdq <- dimensional_query(gms) |>
  select_dimension(name = "where",
                   attributes = c("state", "city")) |>
  select_dimension(name = "when",
                   attributes = c("when_happened_year", "when_happened_week")) |>
  select_fact(
    name = "mrs_age",
    measures = c("n_deaths")
  ) |>
  select_fact(name = "mrs_cause",
              measures = c("pneumonia_and_influenza_deaths", "other_deaths")) |>
  filter_dimension(name = "when", when_happened_week <= "03") |>
  filter_dimension(name = "where", state == "MA")

sf <- gdq |>
  run_geoquery(wider = TRUE)

save_as_geopackage(sf, "city", filepath = tempdir())
To add a dimension to a dimensional_query object, we have to define its name and a subset of the dimension attributes. If only the name of the dimension is indicated, it is considered that all its attributes should be added.
select_dimension(dq, name = NULL, attributes = NULL)

## S3 method for class 'dimensional_query'
select_dimension(dq, name = NULL, attributes = NULL)
dq |
A dimensional_query object. |
name |
A string, name of the dimension. |
attributes |
A vector of attribute names. |
A dimensional_query object.
Other query functions: dimensional_query(), filter_dimension(), run_query(), select_fact()
dq <- dimensional_query(ms_mrs) |>
  select_dimension(name = "where",
                   attributes = c("city", "state")) |>
  select_dimension(name = "when")
To define the fact to be queried, we indicate its name and, optionally, a vector of names of selected measures and another of aggregation functions.
select_fact(dq, name = NULL, measures = NULL, agg_functions = NULL)

## S3 method for class 'dimensional_query'
select_fact(dq, name = NULL, measures = NULL, agg_functions = NULL)
dq |
A dimensional_query object. |
name |
A string, name of the fact. |
measures |
A vector of measure names. |
agg_functions |
A vector of aggregation function names. If none is indicated, those defined in the fact table are considered. |
If no measure name is indicated, only the measure corresponding to the number of aggregated rows is included; that measure is always included.
If no aggregation function is included, those defined for the measures are considered.
A dimensional_query object.
Other query functions: dimensional_query(), filter_dimension(), run_query(), select_dimension()
dq <- dimensional_query(ms_mrs) |>
  select_fact(
    name = "mrs_age",
    measures = c("n_deaths"),
    agg_functions = c("MAX")
  )

dq <- dimensional_query(ms_mrs) |>
  select_fact(name = "mrs_age",
              measures = c("n_deaths"))

dq <- dimensional_query(ms_mrs) |>
  select_fact(name = "mrs_age")
Star Schema for the Mortality Reporting System considering the age classification data test.
st_mrs_age_test
A star_schema object.
https://CRAN.R-project.org/package=starschemar
From the original dataset, some fields have been selected and renamed.
uk_london_boroughs
An sf object.
Since not so much detail is needed, the geometry has been simplified 20 m.
https://data.london.gov.uk/dataset/statistical-gis-boundary-files-london
From the original dataset, some fields have been selected and renamed, and only the Mortality Reporting System cities are included.
usa_cities
An sf object.
https://earthworks.stanford.edu/catalog/stanford-bx729wr3020
From the original dataset, some fields have been selected and renamed, and only the Mortality Reporting System counties are included.
usa_counties
An sf object.
Some county names are repeated within the same state: Baltimore, MD; Richmond, VA; St. Louis, MO. Since counties are accessed by name (county and state), those with the same name within a state have been grouped together.
https://www2.census.gov/geo/tiger/GENZ2018/shp/cb_2018_us_county_20m.zip
From the original dataset, some fields have been selected and renamed.
usa_divisions
An sf object.
https://www2.census.gov/geo/tiger/GENZ2018/shp/cb_2018_us_division_20m.zip
From the original dataset, some fields have been selected and renamed.
usa_nation
An sf object.
https://www2.census.gov/geo/tiger/GENZ2018/shp/cb_2018_us_nation_20m.zip
From the original dataset, some fields have been selected and renamed.
usa_regions
An sf object.
https://www2.census.gov/geo/tiger/GENZ2018/shp/cb_2018_us_region_20m.zip
From the original dataset, some fields have been selected and renamed, and only the Mortality Reporting System states are included.
usa_states
An sf object.
https://www2.census.gov/geo/tiger/GENZ2018/shp/cb_2018_us_state_20m.zip