* CreateConstraints.sql is not run on mirrors.
* The original inline constraints *were* created on mirrors, so the `DROP`
statements remain in a `*-mirror.sql` script.
This was added in 2007 and AFAICT never used for anything.
All our relationship types had the default 0 value except for one,
which had 1 (I recently changed that to 0 as well)...
but that changed nothing since we do not use the value in any way.
I cannot drop this from the Dict in AddLinkType / EditLinkType
since it fails to load old edits without it, so I left it in
but added a comment explaining we no longer enter it.
For some reason label code was being restricted with a constraint
added during table generation, meaning even mirrors have it.
This is really annoying now that the org generating label codes
has started generating 6-digit ones, which we block at the
db level, turning an otherwise easy fix into a schema change.
Additionally, we don't name the constraint, so PostgreSQL names it automatically;
the name it chooses appears to be label_label_code_check.
This moves the constraint to CreateConstraints, where it won't
be added to all mirrors (if we need to change it again in the future
it will be a lot easier), and it of course amends it to actually allow
6-digit label codes.
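As a rough illustration of the rule being relaxed, here is a minimal Python sketch of the range check. The exact bounds are an assumption (the old constraint is taken to have allowed at most 5 digits), and `max_digits` is a hypothetical parameter; the real rule is a SQL check constraint.

```python
def is_valid_label_code(code, max_digits=6):
    """Return True if code is a positive integer with at most max_digits digits.

    This only models the range check; it is not the actual SQL constraint.
    """
    return 0 < code < 10 ** max_digits
```

With the default, a 6-digit code passes; with `max_digits=5` it is rejected, which models the old behaviour.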
The query in the edit search that is supposed to make use of this index was
changed in b21e5c579d0201325f5fbe385a51b663ae6caa69, but the actual index was
never updated to match (so it could not be used).
Note: This was already fixed in production. This patch provides an upgrade
script to run during the next schema change.
While originally I was just planning to make changes
to the edit note table and nothing else, discussion with the team
suggested that it would be better to actually have a table
that stores the changes. This gives us several benefits:
1) we can easily check if an edit note has been deleted or edited
2) we can see who made the change and when
3) we can store the reason the change was made
4) we can see the original value (useful in case a malicious actor
gets access to an admin account somehow, and also useful
to find proof of misconduct if an editor posts an edit note
and then removes it after it has been emailed to the intended insultee).
This table won't be replicated or dumped
(other than in private dumps).
A release marked as cancelled by definition does not have a release date
and as such should not be used when calculating the first release date
of recordings or release groups connected to it.
This amends the get_release_first_release_date_rows function
to ignore releases with status 6 (cancelled).
It also updates the a_upd_release function to recalculate first dates
when a status is changed to or from 6 (cancelled).
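The intended selection can be sketched in Python as follows. This is only a model of the logic, not the SQL function itself: the dict keys are hypothetical, and the sketch assumes that unknown date parts sort after known ones and that releases without a year are skipped.

```python
CANCELLED = 6  # release status ID for "cancelled", as stated above


def first_release_date(releases):
    """Return the earliest (year, month, day) among non-cancelled releases.

    Each release is a dict with hypothetical keys "status" and "date";
    "date" is a (year, month, day) tuple whose unknown parts are None.
    Returns None if no usable date exists.
    """
    def sort_key(date):
        # Unknown parts sort after known ones (an assumption of this sketch).
        return tuple((part is None, part or 0) for part in date)

    dates = [
        r["date"]
        for r in releases
        if r["status"] != CANCELLED and r["date"][0] is not None
    ]
    return min(dates, key=sort_key) if dates else None
```

A cancelled release with an earlier date is ignored, so it no longer pulls the first release date backwards.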
This was apparently used ages ago to store
"a previous algorithm (less accurate) or things imported from freedb
that didn't have the accuracy of our ids", said Rob,
but "we may have dropped all those back in the day
and never nuked the column".
It currently always contains FALSE,
and we seem to use it exactly nowhere.
This is pre-NGS code we aren't planning to get working again.
The feature is probably eventually coming to ListenBrainz instead.
This commit removes all the SQL schema for this feature.
This probably isn't as impactful as the s/mirror/all/ change, since master is a
subset of standalone, but I think it's still slightly clearer than only
specifying standalone.
Ran `jq . upgrade.json > upgrade2.json && mv upgrade2.json upgrade.json`.
The previous formatting was inconsistent and depended too much on the keys'
lengths to align items, which are changing with MBS-12370.
We automatically remove empty ACs after the last usage
is removed, using the dec_ref_count function.
However, the artist_credit_gid_redirect table has FKs to artist_credit,
and this process does not take them into account.
That means that when an AC with only 1 usage has a redirect,
any edit that would remove it is blocked because
the redirect is not being removed and the associated FK still exists.
I talked with mwiencek and we decided to implement an intermediate
table where rows can be entered when the ref_count gets to 0,
rather than immediately deleting them. We then wait 7 days to see
if the artist credit got reused (ref_count went above 0 again)
in the meantime. This avoids losing redirects from an AC
that would get used again soon after, for example because it is being
removed from one recording and then added to another.
After 7 days the artist credit gets removed, including any redirects.
If it did get reused in the meantime, we just drop it from the
temporary table, obviously without removing the AC.
The temporary table setup should also work with the alternative
tracklist code in the future, which is why the table is generic
and stores a table name rather than being specifically AC-related.
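The grace-period flow described above can be sketched in Python like this. It is an in-memory model under stated assumptions, not the actual schema: the class and method names are hypothetical, and the real implementation lives in SQL.

```python
from datetime import datetime, timedelta

GRACE_PERIOD = timedelta(days=7)


class UnreferencedRows:
    """In-memory stand-in for the intermediate table (hypothetical names)."""

    def __init__(self):
        # (table_name, row_id) -> time at which ref_count reached 0
        self.pending = {}

    def on_ref_count_change(self, table_name, row_id, ref_count, now):
        key = (table_name, row_id)
        if ref_count == 0:
            # Remember when the row became unused; keep the earliest time.
            self.pending.setdefault(key, now)
        else:
            # Reused in the meantime: forget it, but keep the row itself.
            self.pending.pop(key, None)

    def expired(self, now):
        """Return keys whose grace period has elapsed and stop tracking them.

        The caller is expected to delete those rows, including any redirects.
        """
        due = [k for k, since in self.pending.items() if now - since >= GRACE_PERIOD]
        for k in due:
            del self.pending[k]
        return due
```

Keying on the table name as well as the row ID mirrors the point that the table is generic rather than AC-specific.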
The dbmirror2 schema and associated objects should be created on
master/mirror nodes, not standalone. However, the schema on its own
should still exist on standalone nodes (as it's created unconditionally
by InitDb.pl).
Three new trigger functions are added:
* update_tag_counts_for_raw_insert
* update_tag_counts_for_raw_update
* update_tag_counts_for_raw_delete
These are executed for insertions/updates/deletions on the raw tag
tables, and update both the aggregate vote counts (area_tag.count, etc.)
and tag.ref_count.
Since these functions are written to work for any entity type, the
tagged entity type must be passed in as an argument to the function
where the trigger is invoked; a new ENUM, taggable_entity_type, has been
added to check that the argument is trusted.
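To make the bookkeeping concrete, here is a small Python model of the three handlers. It rests on assumptions that are not spelled out above: each raw vote is taken to contribute +1 or -1 to the aggregate count, ref_count is taken to be the number of raw tag rows using a tag, and the set of entity types is an illustrative subset. All names other than those quoted from the text are hypothetical.

```python
from collections import Counter

# Illustrative subset standing in for the taggable_entity_type ENUM.
TAGGABLE_ENTITY_TYPES = {"area", "artist", "recording"}

aggregate_count = Counter()  # (entity_type, entity_id, tag_id) -> net votes
tag_ref_count = Counter()    # tag_id -> number of raw tag rows


def _vote(is_upvote):
    return 1 if is_upvote else -1


def _check(entity_type):
    # Plays the role of the ENUM: reject anything that is not a known type.
    if entity_type not in TAGGABLE_ENTITY_TYPES:
        raise ValueError(f"not a taggable entity type: {entity_type}")


def raw_insert(entity_type, entity_id, tag_id, is_upvote):
    _check(entity_type)
    aggregate_count[(entity_type, entity_id, tag_id)] += _vote(is_upvote)
    tag_ref_count[tag_id] += 1


def raw_update(entity_type, entity_id, tag_id, old_is_upvote, new_is_upvote):
    _check(entity_type)
    delta = _vote(new_is_upvote) - _vote(old_is_upvote)
    aggregate_count[(entity_type, entity_id, tag_id)] += delta


def raw_delete(entity_type, entity_id, tag_id, is_upvote):
    _check(entity_type)
    aggregate_count[(entity_type, entity_id, tag_id)] -= _vote(is_upvote)
    tag_ref_count[tag_id] -= 1
```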
The upgrade script also resolves MBS-5359 by rebuilding all the counts.
Existing Perl code that managed these counts has been removed from
Data::EntityTag.
For tests, see t/pgtap/tag_counts.sql.
The link table and associated link_attribute tables mainly exist for
attribute caching: a lot of relationships happen to share the same link
type and attributes, so we can cache such data by a common ID and
minimize the amount of data that needs to be loaded across many
relationships.
Because they are used for caching and are shared across many (otherwise
unrelated) relationships, they are intended to be immutable: they should
only be inserted or deleted.
Updates to these tables should thus be blocked at the schema level.
This also helps ensure the data integrity of future materialized tables
that will rely on their immutability, like area_containment.
This patch contains the schema changes required for PostgreSQL 14 to
work. Note that these changes are also compatible with PostgreSQL 12.
For context, refer to the first incompatibility listed under
https://www.postgresql.org/docs/release/14.0/.
Because anycompatiblearray doesn't exist in v12, the ideal solution (and
the one used here) is to use more specific types.
* The _median function is only used by the median aggregate, which is
only used to calculate median recording lengths -- an integer value.
* The array_accum aggregate was added in 33bd14e for tracklist_index, a
table which was removed years ago. The aggregate is unused and can
be removed too.
* array_cat_agg is only used internally by dbmirror. (See
https://github.com/metabrainz/dbmirror/blob/ca536c7/pending.c#L405.)
Since pg_constraint.conkey is of type int2[], we can use that. (See
https://www.postgresql.org/docs/current/catalog-pg-constraint.html .)
This was added back when we thought we would have different
attributes to store item numbers for part of series relationships.
We don't, so this is always one and the same, 788 (number).
This changes the views to hardcode to 788 and removes the
ordering_attribute column elsewhere. It keeps the
SERIES_ORDERING_ATTRIBUTE constant in JS since that seems to be
used in fields and just points to the 'number' attribute,
but removes it in Perl because it was only used to store
the ID under ordering_attribute and was never used again, AFAICT.
Withdrawn releases were by definition official until withdrawn,
so if a release group has only withdrawn releases, we should
probably still show it (this seems to be the majority community opinion,
anyway).
I understand two comparisons might be a lot more efficient
than any(1, 5), so I'm doing that.
We need to run a one-off script to update RGs where a release was
*already* set to withdrawn, since otherwise these won't be set
to official(ish) until they are touched for some other reason.
Added the script to the upgrade file.
For some reason (probably because it felt like overkill) we originally
implemented the genre_alias table without support
for alias types. That seems problematic though,
because there's at least the usual case of "typo" and
"name in a specific language", so "search hint" vs "genre name".
This adds types, and more generally changes
the genre_alias table to be consistent with the alias tables
for other entities.
To ensure the column order is also consistent,
we drop the old table and re-enter the old data
into a newly created one.
The genre_alias table should be empty in prod and any mirrors,
but to be safe (in case some standalone users are using it)
this first copies any data to a temporary table and then
re-inserts it once done.
While we don't usually want titles and names to have blocks
of more than one space, that can be useful in cases
where the multiple spaces are in fact artist intent.
Additionally, this constraint is currently broken
by several entries in the DB.
For now, this is still enforced by Perl. Further changes will be needed
there if we want to allow it in some cases (e.g. track titles),
and we probably want to keep blocking it in others (e.g. tag names).
We have been marking collection subscriptions as not available,
rather than removing them, whenever a collection is made private.
This takes what seems like the obvious next step and restores them
when the collection is made public again.
If you are able to participate in a private collection, you should also
be able to subscribe to it. Of course, if you're then removed as a
collaborator (thus losing the right to see the collection)
then you should also be removed as a subscriber.
This requires a schema change for the del_collection_sub_on_private
function, since otherwise the collection subscription will be
deactivated for collaborators if the collection was public
and is then made private.
Quoting from mwiencek's ticket directly since it explains
the whole reasoning:
"These triggers ensure that if an alias is set
to primary_for_locale = true, we unset primary_for_locale
on all other aliases for that locale & entity to avoid
uniqueness violations on the *_alias_idx_primary index.
There's a major problem here: cascading triggers
do not mix well with dbmirror, and can break replication.
This is because when row-level AFTER triggers are cascaded,
the innermost recordchange trigger is run first
(inverse of the actual statement order).
This is exactly what caused MBS-9366, and is a very hard problem
to fix in dbmirror properly; it's much safer to outright
ban cascading updates on replicated tables.
Even ignoring the replication issue, we don't need these triggers.
Avoiding duplicate primary locales can be handled
by the application (and already is). If it wasn't,
it even seems better to have the application fail
with a unique index violation (since that indicates it's not
accounting the changes properly) than silently change the data."
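For illustration only, handling this in the application could look roughly like the Python sketch below. It is not the actual application code: the function name and dict keys are hypothetical, and a real implementation would issue the corresponding updates in a single transaction.

```python
def set_primary_for_locale(aliases, alias_id):
    """Mark one alias as primary for its locale and unset its siblings.

    aliases is a list of dicts with hypothetical keys "id", "entity",
    "locale" and "primary_for_locale". Aliases of other entities or
    locales are left untouched.
    """
    target = next(a for a in aliases if a["id"] == alias_id)
    for a in aliases:
        same_group = (
            a["entity"] == target["entity"] and a["locale"] == target["locale"]
        )
        if same_group:
            a["primary_for_locale"] = a["id"] == alias_id
```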
For some reason, we did not have delete_unused_tag triggers
for every taggable entity type. This was causing tags deleted
from the missing entity types to remain in the DB with 0 uses.
Avoid re-calculating `*first_release_date` data if the release event
didn't actually change.
This can happen if the trigger fired on the `release_country` table and
only the country changed. (Rather than checking for that, it's still
correct to test the data we're concerned about: `UPDATE` triggers fire
even when no data changed.)
Additionally, make sure we're only calling
`set_release_first_release_date` once if the release column is
unchanged. (This would occur on merges, and probably no other time.)
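Both checks can be modelled in a few lines of Python. This is a sketch under assumptions: the column names are guesses at the relevant fields, and `releases_to_recalculate` is a hypothetical helper standing in for the trigger logic.

```python
# Columns assumed to matter for first release dates; "country" is not one.
RELEVANT_FIELDS = ("release", "date_year", "date_month", "date_day")


def release_event_changed(old_row, new_row):
    """True if any column relevant to first release dates differs."""
    return any(old_row[f] != new_row[f] for f in RELEVANT_FIELDS)


def releases_to_recalculate(old_row, new_row):
    """Return the release IDs whose first release dates need recalculating.

    Using a set means an unchanged release column yields one ID, not two.
    """
    if not release_event_changed(old_row, new_row):
        return set()
    return {old_row["release"], new_row["release"]}
```

A country-only update yields an empty set, a date change yields one release, and a changed release column (as on merges) yields both the old and the new release.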