6 changes: 3 additions & 3 deletions doc/gallery/examples/minard.qmd
@@ -52,15 +52,15 @@ To explain what we have done here:
* `lat AS y` sets the `lat` (latitude) column as the `y` aesthetic.
* `DRAW line` instructs to plot to use the `line` layer.

No celebrated military strategist would plan his troup movements towards Moscow in this fashion though.
No celebrated military strategist would plan his troop movements towards Moscow in this fashion though.
The chart only shows movement in the west-east direction, meaning that we are not capturing the retreat properly.

## Correcting mistakes

The first 'mistake' we made is chosing the `line` layer.
The first 'mistake' we made is choosing the `line` layer.
Line layers automatically sort along the axis, so we're mixing coordinates from the advance and the retreat.
To rectify this, we should use the `path` layer instead.
Path layers connect datapoints in the order they appear in, so we're no long sorting along west-east.
Path layers connect datapoints in the order they appear in, so we're no longer sorting along west-east.
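
The difference between the two layers can be sketched outside ggsql. The following Python snippet is only an illustration: the coordinates are made up (they are not taken from `minard_troops.csv`), and `line_order` and `path_order` are hypothetical helpers that mimic the ordering behaviour described above, not ggsql functions.

```python
# Hypothetical (long, lat) points: an advance eastwards followed by a retreat westwards.
route = [(24.0, 54.9), (30.0, 55.0), (37.6, 55.8), (32.0, 54.6), (25.0, 54.4)]

def line_order(points):
    """Mimic a layer that sorts records along the x axis before connecting them."""
    return sorted(points, key=lambda p: p[0])

def path_order(points):
    """Mimic a layer that connects records in the order they appear."""
    return list(points)

line_points = line_order(route)  # advance and retreat points get interleaved
path_points = path_order(route)  # the route is kept as recorded
```

In the sorted version the retreat points end up between the advance points, which is why the chart loses the return journey.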

```{ggsql}
VISUALISE long AS x, lat AS y FROM 'minard_troops.csv'
2 changes: 1 addition & 1 deletion doc/syntax/clause/draw.qmd
@@ -97,7 +97,7 @@ Often the implicit grouping from the aesthetic mapping is enough, e.g. mapping a
ORDER BY <column>, ...
```

For some layers the order of records in the data is important, e.g. the path layer which connect records in the order they are provided. Since databases often doesn't guarantee a specific order of the data, the `ORDER BY` clause can be used to enforce such and order. Even for layers where the order doesn't immediately seem to matter it may have an effect, e.g. an overplottet scatterplot where the records in the end of the data are plottet on top of the one in the start.
For some layers the order of records in the data is important, e.g. the path layer which connect records in the order they are provided. Since databases often don't guarantee a specific order of the data, the `ORDER BY` clause can be used to enforce such an order. Even for layers where the order doesn't immediately seem to matter it may have an effect, e.g. an overplotted scatterplot where the records in the end of the data are plotted on top of the one in the start.

## Layer orientation
Some layer types treat the two axes differently. For instance, a boxplot has categories along a discrete axis and summary statistics along a continuous one. While we are used to seeing boxplots with categories along the x-axis, this is not a necessity. The orientation can be deduced directly from the mappings in the layer. So, if you map discrete data to the x axis and continuous data to the y axis you get a boxplot in the standard orientation, whereas if you switch the mapping the boxes will "lay down" instead. The vast majority of layers that have an orientation also have a unique mapping pattern that allows us to deduce the orientation directly from the mapping. The few layers where the mapping is ambiguous (e.g. `line`) have an `orientation` setting that allows you to set the orientation explicitly.
2 changes: 1 addition & 1 deletion doc/syntax/clause/place.qmd
@@ -29,7 +29,7 @@ The `SETTING` clause can be used for two different things:

* *Setting aesthetics*: All aesthetics in `PLACE` layers are specified using literal value, e.g. 'red' (as in the color red).
Aesthetics that are set will not go through a scale but will use the provided value as-is.
You cannot set an aesthetic to a column, only to a literal values.
You cannot set an aesthetic to a column, only to a literal value.
Contrary to `DRAW` layers, `PLACE` layers can take multiple literal values in an array.
* *Setting parameters*: Some layers take additional arguments that control how they behave.
Often, but not always, these modify the statistical transformation in some way.
2 changes: 1 addition & 1 deletion doc/syntax/index.qmd
@@ -8,7 +8,7 @@ ggsql augments the standard SQL syntax with a number of new clauses to describe
- [`VISUALISE`](clause/visualise.qmd) initiates the visualisation part of the query
- [`DRAW`](clause/draw.qmd) adds a new layer to the visualisation
- [`PLACE`](clause/place.qmd) adds an annotation layer
- [`SCALE`](clause/scale.qmd) specify how an aesthetic should be scaled
- [`SCALE`](clause/scale.qmd) specifies how an aesthetic should be scaled
- [`FACET`](clause/facet.qmd) describes how data should be split into small multiples
- [`PROJECT`](clause/project.qmd) is used for selecting the coordinate system to use
- [`LABEL`](clause/label.qmd) is used to manually add titles to the plot or the various axes and legends
2 changes: 1 addition & 1 deletion doc/syntax/layer/type/area.qmd
@@ -67,7 +67,7 @@ DRAW area
SETTING position => 'identity', opacity => 0.5
```

Whith the default `position => 'stack'` we can normalise the total so that each stack totals to the same value. These only make sense if every series is measured in the same absolute unit. (Wind and temperature have different units and the temperature is not absolute.)
With the default `position => 'stack'` we can normalise the total so that each stack totals to the same value. These only make sense if every series is measured in the same absolute unit. (Wind and temperature have different units and the temperature is not absolute.)

```{ggsql}
VISUALISE Date AS x, Value AS y, Series AS colour FROM long_airquality
2 changes: 1 addition & 1 deletion doc/syntax/layer/type/histogram.qmd
@@ -7,7 +7,7 @@ title: "Histogram"
Visualise the distribution of a single continuous variable by dividing the primary axis into bins and counting the number of observations in each bin. If providing a weight then a weighted histogram is calculated instead.

## Aesthetics
The following aesthetics are recognised by the bar layer.
The following aesthetics are recognised by the histogram layer.

### Required
* Primary axis (e.g. `x`): The continuous variable to bin
2 changes: 1 addition & 1 deletion doc/syntax/layer/type/rule.qmd
@@ -84,7 +84,7 @@ DRAW rule
MAPPING value AS y, label AS colour FROM thresholds
```

Add a diagnoal reference line to a scatterplot by using `slope`
Add a diagonal reference line to a scatterplot by using `slope`

```{ggsql}
VISUALISE FROM ggsql:penguins
2 changes: 1 addition & 1 deletion doc/syntax/layer/type/smooth.qmd
@@ -80,7 +80,7 @@ $$
### Total least squares

The `method => 'tls'` setting uses total least squares to compute the intercept $a$ and slope $b$ of a straight line.
The method minimizes the 2-dimensiontal distance between a point and the perpendicular projection of that point on the line.
The method minimizes the 2-dimensional distance between a point and the perpendicular projection of that point on the line.
Minimising the perpendicular distances (rather than just the vertical distances) makes sense if there is uncertainty or measurement error in not just $y$, but in $x$ as well.
In such case, it is a more accurate depiction of the relationship between $x$ and $y$, but it isn't the best predictor of $y$ given $x$.
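
As a rough illustration of the idea, the closed-form total least squares fit for a straight line $y = a + bx$ can be sketched in Python. This is a minimal sketch under stated assumptions, not ggsql's implementation: `tls_line` is a hypothetical helper, and it simply refuses the degenerate case where the covariance term is zero.

```python
import math

def tls_line(xs, ys):
    """Fit y = a + b*x by minimising perpendicular distances (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Centred sums of squares and cross products
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if s_xy == 0:
        # Horizontal, vertical, or undetermined line; not handled in this sketch
        raise ValueError("s_xy is zero; the slope needs separate handling")
    b = (s_yy - s_xx + math.sqrt((s_yy - s_xx) ** 2 + 4 * s_xy ** 2)) / (2 * s_xy)
    a = y_bar - b * x_bar  # the fitted line passes through the centroid
    return a, b

# Points lying exactly on y = 1 + 2x are recovered exactly
a_exact, b_exact = tls_line([0, 1, 2, 3], [1, 3, 5, 7])

# With scatter, the TLS slope differs from the ordinary least squares slope (0.5 here)
a_noisy, b_noisy = tls_line([0, 1, 2], [0, 2, 1])
```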

6 changes: 3 additions & 3 deletions doc/syntax/layer/type/text.qmd
@@ -12,10 +12,10 @@ The following aesthetics are recognised by the text layer.
### Required
* Primary axis (e.g. `x`): Position along the primary axis.
* Secondary axis (e.g. `y`): Position along the secondary axis.
* `label` The text to dislay.
* `label` The text to display.

### Optional
* `stroke` The colour at the contour lines of glyphs. Typically kept blank.
* `stroke` The colour of the contour lines of glyphs. Typically kept blank.
* `fill` The colour of the glyphs.
* `colour` Shorthand for setting `stroke` and `fill` simultaneously.
* `opacity` The opacity of the fill colour.
@@ -66,7 +66,7 @@ Known formatters are:
* `x`/`X`: Unsigned hexadecimal

## Data transformation
The text layer does not transform its data but passed it through unchanged.
The text layer does not transform its data but passes it through unchanged.

## Orientation
The text layer has no orientation. The axes are treated symmetrically.
2 changes: 1 addition & 1 deletion doc/syntax/scale/aesthetic/0_position.qmd
@@ -7,7 +7,7 @@ All layers need at least one of each position aesthetic mapped in order to show
However, the layer may compute position aesthetics from the mapping.
For example, a bar plot calculates the `y` aesthetic by counting the number of records in each group defined by the mapped `x` aesthetic.

In the above we use `x`and `y` as examples of position aesthetics, but in reality the position aesthetic names are defined by the coordinate system in use.
In the above we use `x` and `y` as examples of position aesthetics, but in reality the position aesthetic names are defined by the coordinate system in use.
A Cartesian coordinate system recognizes `x` and `y` whereas a polar coordinate system recognizes `radius` and `angle`.
You can implicitly choose the coordinate system by mapping to the aesthetics that it uses, i.e. if you map to `radius` and `angle` then a polar coordinate system will be chosen for you.

1 change: 0 additions & 1 deletion doc/syntax/scale/aesthetic/linetype.qmd
@@ -46,7 +46,6 @@ Valid hex patterns:
ggsql provides two linetype palettes which are generally enough for every need

The `categorical` palette is the default palette for discrete linetype scales. It consists of the 6 named patterns [shown above](#literal-values) in the same order. Since it is the only palette for discrete linetypes, there is rarely a need to specify it.
The `categorical` palette is the default palette for discrete linetype scales. It consists of the 6 named patterns [shown above](#literal-values) in the same order. Since it is the only palette for discrete linetypes there is rarely a need to specify it.

### Sequential palette
The `sequential` palette is the default for binned and ordinal linetype scales. It consists of up to 15 patterns with increasing amount of "on" and decreasing amount of "off". This creates a visual progression from sparse (low ink) to solid (100% ink).
2 changes: 1 addition & 1 deletion doc/syntax/scale/aesthetic/shape.qmd
@@ -59,7 +59,7 @@ SCALE shape TO ('star', 'bowtie', 'square-plus')
### Default palette (`closed`)
The default palette contains the 9 closed shapes (first nine in the table above). This is the recommended palette for most use cases, as closed shapes are more visually prominent and easier to distinguish at small sizes.

While the closed shapes are most often used filled, you can also turn if fill and only draw the stroke for a lighter look.
While the closed shapes are most often used filled, you can also turn off fill and only draw the stroke for a lighter look.

### Open palette (`open`)
The `open` palette contains the last 6 shapes in the table. None of these have a fill. You may use this palette when you want transparent shapes that don't obscure data.
24 changes: 12 additions & 12 deletions doc/syntax/scale/type/binned.qmd
@@ -4,14 +4,14 @@ title: Binned

> Scales are declared with the [`SCALE` clause](../../clause/scale.qmd). Read the documentation for this clause for a thorough description of its syntax.

The binned scale type maps continuous data types into a discrete output domain. It can either be used to bin continuous data for layers that needs a discrete scale, e.g. the [bar layer](../../layer/type/bar.qmd), or to discretize a continuous output range to make clearer visual separation between the groups. Lastly, while generally not advised, it can also be used to map continuous data to an aesthetic that is otherwise only meaningful for discrete data (e.g. `shape`).
The binned scale type maps continuous data types into a discrete output domain. It can either be used to bin continuous data for layers that needs a discrete scale, e.g. the [bar layer](../../layer/type/bar.qmd), or to discretize a continuous output range to make clearer visual separation between the groups. Lastly, while generally not advised, it can also be used to map continuous data to an aesthetic that is otherwise only meaningful for discrete data (e.g. `shape`).

The binned scale is never chosen automatically so it must be selected explicitly if needed using `SCALE BINNED ...`

## Input range
The input range for binned scales are defined by their minimum and maximum values. These can be given explicitly or deduced from the mapped data. If `FROM` is omitted then the range will be given as the minimum and maximum break values, whether provided directly or calculated. If provided as an array of length 2 then the first element will set the minimum and the second element will set the maximum. If either of these elements are `null` then that part of the range will be deduced from the data. As an example `SCALE BINNED x FROM (0, null)` will set the minimum part of the range to 0 and the maximum part to the maximal value of the mapped data. However, if neither input range nor explicit breaks are provided then the input range will be modified so that the calculated bins are even sized and include all data. This means that the range in most cases will expand past the minimum and maximum data values.
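
The `null` handling described above can be sketched in Python. This is only an illustration of the stated rule, with `None` standing in for `null`: `resolve_input_range` is a hypothetical helper, and it does not cover the separate case where the range is widened to produce evenly sized bins.

```python
def resolve_input_range(requested, data):
    """Fill the missing ends of a (min, max) range from the data (hypothetical helper)."""
    lo, hi = requested
    if lo is None:
        lo = min(data)  # minimum deduced from the mapped data
    if hi is None:
        hi = max(data)  # maximum deduced from the mapped data
    return (lo, hi)

# Made-up values standing in for a mapped column
values = [2700, 3550, 4200, 6300]

# Corresponds to `FROM (0, null)`: fixed minimum, maximum taken from the data
resolved = resolve_input_range((0, None), values)
```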

Position aesthetics (`x` and `y`) will have their range expanded based on the `expand` setting.
Position aesthetics (`x` and `y`) will have their range expanded based on the `expand` setting.
If values in the mapped data falls outside of the input domain the values will be changed based on the `oob` setting.

The input range is converted to the type defined by the transform. This means that a time range can both be given as a `%H:%M:%S` string or as a numeric giving the number of nanoseconds since midnight.
@@ -42,7 +42,7 @@ The output range can either be given as an array of values or a named palette. F
All aesthetics have a default output range so it is never required to provide one unless you want to change from the default. The defaults are as follows:

* `x`/`y`: Ignored (values used directly)
* `stroke`/`fill`: The `navia` palette
* `stroke`/`fill`: The `ggsql` palette
* `size`/`linewidth`: `(1, 6)` (points)
* `opacity`: `(0.1, 1.0)` (0 being fully transparent and 1 being fully opaque)
* `linetype`: The `sequential` palette
@@ -52,15 +52,15 @@ While it is possible to use a binned scale to map continuous data to linetype an

### Examples

#### Select a continuous color palette
#### Select a continuous color palette
```{ggsql}
VISUALISE bill_len AS x, bill_dep AS y, body_mass AS color FROM ggsql:penguins
DRAW point
SCALE BINNED color TO viridis
```

## Transform
The transform of the scale both defines how the input data is parsed as well as any mathematical transform applied before it is mapped to the output range. The default transform is deduced from a combination of the mapped data and the aesthetic the scale is applied to.
The transform of the scale both defines how the input data is parsed as well as any mathematical transform applied before it is mapped to the output range. The default transform is deduced from a combination of the mapped data and the aesthetic the scale is applied to.

* `linear`: The default transform unless stated otherwise. Creates a linear mapping between the input and output range.
* `log`/`log2`/`ln`: Creates a mapping between the logarithm of the input to the output range.
@@ -81,15 +81,15 @@ Since breaks are not just presentational as it is with continuous scales the cho

* `linear`:
- `pretty => true`: Will use Wilkinsons Extended algorithm to attempt to find nice breaks in the given interval close to the number of breaks requested
- `pretty => false`: Will produce the requested number of evenly spaced breaks within the scale range
* `log`/`log2`/`ln`:
- `pretty => false`: Will produce the requested number of evenly spaced breaks within the scale range
* `log`/`log2`/`ln`:
- `pretty => true`: Will use the 1-2-5 pattern and thin down to approximately the requested number of breaks
- `pretty => false`: Breaks will be exclusively at the power of the base (e.g. 1, 10, 100, 1000 for log10)
* `exp10`/`exp2`/`exp`: Same logic as the log breaks but in the inverse direction
* `sqrt`/`square`: Like `linear` but the range is first converted to sqrt space and the breaks are then converted back
* `asinh`/`pseudo_log`/`pseudo_log2`/`pseudo_ln`: Like `log` but includes zero and negates the breaks for the negative part
* `integer`: Like `linear` except disallowing breaks at fractional parts
* `date`/`datetime`/`time`:
* `date`/`datetime`/`time`:
- `breaks => <interval>`: If breaks are given as an interval (e.g. `week`, `30 seconds` or `5 years`) then the breaks will get that spacing aligned at the interval boundary (Jan 1 for years, etc). This ignores the `pretty` setting
- `pretty => true`: An appropriate interval is chosen that approximates the requested number of breaks and then used as above
- `pretty => false`: Linear spacing in integer space as close to the requested number of breaks
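
To make the two log-scale behaviours above concrete, here is a minimal Python sketch for base 10. It is not ggsql's algorithm: both function names are hypothetical, and the step that thins the candidates down to roughly the requested number of breaks is left out.

```python
import math

def log10_breaks_pretty(lo, hi):
    """Candidate breaks following the 1-2-5 pattern within [lo, hi] (no thinning)."""
    # Pad the exponent range by one on each side, then filter to the interval
    k_min = math.floor(math.log10(lo)) - 1
    k_max = math.ceil(math.log10(hi)) + 1
    candidates = [m * 10 ** k for k in range(k_min, k_max + 1) for m in (1, 2, 5)]
    return [v for v in candidates if lo <= v <= hi]

def log10_breaks_powers(lo, hi):
    """Breaks exclusively at powers of the base within [lo, hi]."""
    k_min = math.floor(math.log10(lo)) - 1
    k_max = math.ceil(math.log10(hi)) + 1
    return [10 ** k for k in range(k_min, k_max + 1) if lo <= 10 ** k <= hi]

pretty_breaks = log10_breaks_pretty(1, 1000)
power_breaks = log10_breaks_powers(1, 1000)
```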
@@ -140,16 +140,16 @@ The following settings are recognised by binned scales:
VISUALISE body_mass AS x FROM ggsql:penguins
DRAW bar
SCALE BINNED x
SETTING
oob => 'squish',
SETTING
oob => 'squish',
breaks => (4000, 4250, 4500, 4750, 5000, 5250, 5500)
```

## Renaming
Breaks are generally named by their value. However, you may wish to rename one, several, or all of these. The `RENAMING` clause allows you to do that both by directly renaming a specific break or by providing a formatting function.

### Direct renaming
When you provide a break value on the left and a break exist at that value then it will take on the label specified on the right. For examples adding `RENAMING 0 => 'Nil'` will ensure that if there is a break at 0 it will appear as "Nil" on the legend/axis
When you provide a break value on the left and a break exists at that value then it will take on the label specified on the right. For example adding `RENAMING 0 => 'Nil'` will ensure that if there is a break at 0 it will appear as "Nil" on the legend/axis

### Label formatting
Besides direct renaming you can also provide a formatting string if you want the same to happen to all labels, e.g. add a prefix or suffix. The syntax for this is `RENAMING * => '... {} ...'`. The current label will be inserted into the `{}` to produce the new label. Besides simply inserting the break value into the string, we can also provide a formatter. Of special interest to binned scales are the `:time` and `:num` formatters which lets you control how temporal and numeric values are presented. You can read more about these formatters in [the break formatting section of the `SCALE` documentation](../../clause/scale.qmd#break-formatting)
@@ -177,7 +177,7 @@ SCALE BINNED x
```{ggsql}
VISUALISE bill_dep AS x FROM ggsql:penguins
DRAW bar
SCALE BINNED x
SCALE BINNED x
RENAMING * => '{} mm'
```

2 changes: 1 addition & 1 deletion doc/syntax/scale/type/continuous.qmd
@@ -189,7 +189,7 @@ SCALE x
Breaks are generally named by their value. However, you may wish to rename one, several, or all of these. The `RENAMING` clause allows you to do that both by directly renaming a specific break or by providing a formatting function.

### Direct renaming
When you provide a break value on the left and a break exist at that value then it will take on the label specified on the right. For examples adding `RENAMING 0 => 'Nil'` will ensure that if there is a break at 0 it will appear as "Nil" on the legend/axis
When you provide a break value on the left and a break exists at that value then it will take on the label specified on the right. For example adding `RENAMING 0 => 'Nil'` will ensure that if there is a break at 0 it will appear as "Nil" on the legend/axis

### Label formatting
Besides direct renaming you can also provide a formatting string if you want the same to happen to all labels, e.g. add a prefix or suffix. The syntax for this is `RENAMING * => '... {} ...'`. The current label will be inserted into the `{}` to produce the new label. Besides simply inserting the break value into the string, we can also provide a formatter. Of special interest to continuous scales are the `:time` and `:num` formatters which lets you control how temporal and numeric values are presented. You can read more about these formatters in [the break formatting section of the `SCALE` documentation](../../clause/scale.qmd#break-formatting)