Mario e709f8ac70 [rhythm] Introduce block-builder and kafka ingest path (#4533)
* Block-builder PoC

* Add unit test for block-builder (#4289)

* Add unit test for block-builder

* fmt

* Update tests

* cmon

* Deterministically build blocks for partition sections (#4327)

* Pull main (#4342)

* chore: remove gofakeit dependency (#4274)

* Further reduce Labes() calls in the metrics registry (#4283)

* Respect passed headers in read path requests (#4287)

* Ingester: Validate completed blocks (#4256)

* Add validate method to block

Signed-off-by: Joe Elliott <number101010@gmail.com>

* Add Validate usage in the ingester

Signed-off-by: Joe Elliott <number101010@gmail.com>

* changelog

Signed-off-by: Joe Elliott <number101010@gmail.com>

* add test and fix replay

Signed-off-by: Joe Elliott <number101010@gmail.com>

* increment metric

Signed-off-by: Joe Elliott <number101010@gmail.com>

---------

Signed-off-by: Joe Elliott <number101010@gmail.com>

* Add `invalid_utf8` to reasons spans could be rejected (#4293)

* Add `invalid_utf8` to reasons spans could be rejected

* Update changelog

* Update docs

* Ensure test covers invalid UTF-8 and not slack time

* add signals for duplicate rf1 data (#4296)

Signed-off-by: Joe Elliott <number101010@gmail.com>

* Bump anchore/sbom-action from 0.17.5 to 0.17.7 (#4307)

Bumps [anchore/sbom-action](https://github.com/anchore/sbom-action) from 0.17.5 to 0.17.7.
- [Release notes](https://github.com/anchore/sbom-action/releases)
- [Changelog](https://github.com/anchore/sbom-action/blob/main/RELEASE.md)
- [Commits](https://github.com/anchore/sbom-action/compare/v0.17.5...v0.17.7)

---
updated-dependencies:
- dependency-name: anchore/sbom-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* docs: Update readme with explore traces info (#4263)

* docs: Update readme with explore traces info

Co-authored-by: Kim Nylander <104772500+knylander-grafana@users.noreply.github.com>

* chore: remove spanlogger (#4312)

* chore: remove spanlogger

* Query-Frontend: Add middleware to drop headers (#4298)

* header strip ware

Signed-off-by: Joe Elliott <number101010@gmail.com>

* comment

Signed-off-by: Joe Elliott <number101010@gmail.com>

* changelog

Signed-off-by: Joe Elliott <number101010@gmail.com>

* remove header strip wear from metrics summary

Signed-off-by: Joe Elliott <number101010@gmail.com>

---------

Signed-off-by: Joe Elliott <number101010@gmail.com>

* Increase length of time compactions have to fail (#4315)

* increase length of time compactions have to fail

Signed-off-by: Joe Elliott <number101010@gmail.com>

* gen

Signed-off-by: Joe Elliott <number101010@gmail.com>

---------

Signed-off-by: Joe Elliott <number101010@gmail.com>

* docs: mark serverless as deprecated (#4017)

* docs: mark serverless as deprecated

* Changelog + readme

* docs: Remove duplicated examples (#4295)

This removes duplicates examples from the Configure TraceQL
metrics page.

Signed-off-by: Alex Bikfalvi <alex.bikfalvi@grafana.com>

* tempo-cli: support dropping multiple traces in a single operation (#4266)

* tempo-cli: support dropping multiple traces in a single operation

* update final log message

---------

Co-authored-by: Suraj Nath <9503187+electron0zero@users.noreply.github.com>

* [DOC] Add clarification for metrics summary and traceQL metrics (#4316)

* Add clarification for metrics summary and traceQL metrics

* Apply suggestions from code review

Co-authored-by: Jennifer Villa <jvilla2013@gmail.com>

* Update docs/sources/tempo/api_docs/metrics-summary.md

---------

Co-authored-by: Jennifer Villa <jvilla2013@gmail.com>

* TraceQL metrics time range fixes (#4325)

* Disconnect job time range filtering from step, so that results in split backend/recent range is accurate

* changelog

* Fix to assert metrics query range before alignment because alignment may increase it, which is not the responsibility of the caller to account for (#4331)

* Add doc about configuring TLS with Helm (#4328)

* Add doc about configuring TLS with Helm

* Add memberlist and readinessProbe to example

* Include server config for listening on TLS

* Add note about scraping

* Update docs/sources/tempo/configuration/network/tls.md

Co-authored-by: Markus Toivonen <markus.toivonen@hoxhunt.com>

* Update docs/sources/tempo/configuration/network/tls.md

Co-authored-by: Kim Nylander <104772500+knylander-grafana@users.noreply.github.com>

* Update docs/sources/tempo/configuration/network/tls.md

Co-authored-by: Kim Nylander <104772500+knylander-grafana@users.noreply.github.com>

* Add memcached config for TLS

---------

Co-authored-by: Markus Toivonen <markus.toivonen@hoxhunt.com>
Co-authored-by: Kim Nylander <104772500+knylander-grafana@users.noreply.github.com>

* [DOC] Add TLS info to Helm chart doc (#4334)

---------

Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Alex Bikfalvi <alex.bikfalvi@grafana.com>
Co-authored-by: Javier Molina Reyes <javiermolinar@live.com>
Co-authored-by: Zach Leslie <zach.leslie@grafana.com>
Co-authored-by: Joe Elliott <number101010@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Ryan Perry <Rperry2174@gmail.com>
Co-authored-by: Kim Nylander <104772500+knylander-grafana@users.noreply.github.com>
Co-authored-by: Suraj Nath <9503187+electron0zero@users.noreply.github.com>
Co-authored-by: Alex Bikfalvi <alex@bikfalvi.com>
Co-authored-by: Andrey Karpov <ndk@users.noreply.github.com>
Co-authored-by: Jennifer Villa <jvilla2013@gmail.com>
Co-authored-by: Martin Disibio <martin.disibio@grafana.com>
Co-authored-by: Markus Toivonen <markus.toivonen@hoxhunt.com>

* WIP: Rhythm ingest path (#4314)

* Validate distributor config. Finish encoder/decoder tests

* Repair tests

* Make SingleBinary work out of the box by defaulting to partition 0

* Fix first time startup where blockbuilder fails before ingester can create topic

* Fix initial startup cycle time and delay

* Add more failure modes to the block-builder (#4345)

* Add more tests to the block-builder

* stuff

* Add comments

* [Rhythm] Metrics generator read from kafka first pass (#4359)

* Metrics generator read from kafka first pass

* review feedback

* Multiple fixes in block-builder (#4364)

* [rhythm] git merge origin/main (#4376)

* chore: remove gofakeit dependency (#4274)

* Further reduce Labes() calls in the metrics registry (#4283)

* Respect passed headers in read path requests (#4287)

* Ingester: Validate completed blocks (#4256)

* Add validate method to block

Signed-off-by: Joe Elliott <number101010@gmail.com>

* Add Validate usage in the ingester

Signed-off-by: Joe Elliott <number101010@gmail.com>

* changelog

Signed-off-by: Joe Elliott <number101010@gmail.com>

* add test and fix replay

Signed-off-by: Joe Elliott <number101010@gmail.com>

* increment metric

Signed-off-by: Joe Elliott <number101010@gmail.com>

---------

Signed-off-by: Joe Elliott <number101010@gmail.com>

* Add `invalid_utf8` to reasons spans could be rejected (#4293)

* Add `invalid_utf8` to reasons spans could be rejected

* Update changelog

* Update docs

* Ensure test covers invalid UTF-8 and not slack time

* add signals for duplicate rf1 data (#4296)

Signed-off-by: Joe Elliott <number101010@gmail.com>

* Bump anchore/sbom-action from 0.17.5 to 0.17.7 (#4307)

Bumps [anchore/sbom-action](https://github.com/anchore/sbom-action) from 0.17.5 to 0.17.7.
- [Release notes](https://github.com/anchore/sbom-action/releases)
- [Changelog](https://github.com/anchore/sbom-action/blob/main/RELEASE.md)
- [Commits](https://github.com/anchore/sbom-action/compare/v0.17.5...v0.17.7)

---
updated-dependencies:
- dependency-name: anchore/sbom-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* docs: Update readme with explore traces info (#4263)

* docs: Update readme with explore traces info

Co-authored-by: Kim Nylander <104772500+knylander-grafana@users.noreply.github.com>

* chore: remove spanlogger (#4312)

* chore: remove spanlogger

* Query-Frontend: Add middleware to drop headers (#4298)

* header strip ware

Signed-off-by: Joe Elliott <number101010@gmail.com>

* comment

Signed-off-by: Joe Elliott <number101010@gmail.com>

* changelog

Signed-off-by: Joe Elliott <number101010@gmail.com>

* remove header strip wear from metrics summary

Signed-off-by: Joe Elliott <number101010@gmail.com>

---------

Signed-off-by: Joe Elliott <number101010@gmail.com>

* Increase length of time compactions have to fail (#4315)

* increase length of time compactions have to fail

Signed-off-by: Joe Elliott <number101010@gmail.com>

* gen

Signed-off-by: Joe Elliott <number101010@gmail.com>

---------

Signed-off-by: Joe Elliott <number101010@gmail.com>

* docs: mark serverless as deprecated (#4017)

* docs: mark serverless as deprecated

* Changelog + readme

* docs: Remove duplicated examples (#4295)

This removes duplicates examples from the Configure TraceQL
metrics page.

Signed-off-by: Alex Bikfalvi <alex.bikfalvi@grafana.com>

* tempo-cli: support dropping multiple traces in a single operation (#4266)

* tempo-cli: support dropping multiple traces in a single operation

* update final log message

---------

Co-authored-by: Suraj Nath <9503187+electron0zero@users.noreply.github.com>

* [DOC] Add clarification for metrics summary and traceQL metrics (#4316)

* Add clarification for metrics summary and traceQL metrics

* Apply suggestions from code review

Co-authored-by: Jennifer Villa <jvilla2013@gmail.com>

* Update docs/sources/tempo/api_docs/metrics-summary.md

---------

Co-authored-by: Jennifer Villa <jvilla2013@gmail.com>

* TraceQL metrics time range fixes (#4325)

* Disconnect job time range filtering from step, so that results in split backend/recent range is accurate

* changelog

* Fix to assert metrics query range before alignment because alignment may increase it, which is not the responsibility of the caller to account for (#4331)

* Add doc about configuring TLS with Helm (#4328)

* Add doc about configuring TLS with Helm

* Add memberlist and readinessProbe to example

* Include server config for listening on TLS

* Add note about scraping

* Update docs/sources/tempo/configuration/network/tls.md

Co-authored-by: Markus Toivonen <markus.toivonen@hoxhunt.com>

* Update docs/sources/tempo/configuration/network/tls.md

Co-authored-by: Kim Nylander <104772500+knylander-grafana@users.noreply.github.com>

* Update docs/sources/tempo/configuration/network/tls.md

Co-authored-by: Kim Nylander <104772500+knylander-grafana@users.noreply.github.com>

* Add memcached config for TLS

---------

Co-authored-by: Markus Toivonen <markus.toivonen@hoxhunt.com>
Co-authored-by: Kim Nylander <104772500+knylander-grafana@users.noreply.github.com>

* [DOC] Add TLS info to Helm chart doc (#4334)

* fix deprecation warning by switching to DoBatchWithOptions (#4343)

Signed-off-by: Daniel Strobusch <1847260+dastrobu@users.noreply.github.com>

* bump dskit to v0.0.0-20241115082728-f2a7eb3aa0e9 to leverage benefits for context causes for DoBatch calls. (#4341)

See https://github.com/grafana/dskit/issues/576

Signed-off-by: Daniel Strobusch <1847260+dastrobu@users.noreply.github.com>

* Bump github.com/minio/minio-go/v7 from 7.0.70 to 7.0.80 (#4282)

* Bump github.com/minio/minio-go/v7 from 7.0.70 to 7.0.80

Bumps [github.com/minio/minio-go/v7](https://github.com/minio/minio-go) from 7.0.70 to 7.0.80.
- [Release notes](https://github.com/minio/minio-go/releases)
- [Commits](https://github.com/minio/minio-go/compare/v7.0.70...v7.0.80)

---
updated-dependencies:
- dependency-name: github.com/minio/minio-go/v7
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update serverless vendor

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Zach Leslie <zach.leslie@grafana.com>

* update default config values to better align with production workloads (#4340)

* update default config values to better align with production workloads

* Update CHANGELOG.md and config docs

* Ingester memory improvements by adjusting prealloc (#4344)

* remove trace ids

Signed-off-by: Joe Elliott <number101010@gmail.com>

* linear buckets

Signed-off-by: Joe Elliott <number101010@gmail.com>

* changelog

Signed-off-by: Joe Elliott <number101010@gmail.com>

* tuney tune

Signed-off-by: Joe Elliott <number101010@gmail.com>

* metric misses and increase pool size

Signed-off-by: Joe Elliott <number101010@gmail.com>

* lint

Signed-off-by: Joe Elliott <number101010@gmail.com>

---------

Signed-off-by: Joe Elliott <number101010@gmail.com>

* Bump github.com/Azure/azure-sdk-for-go/sdk/azcore from 1.13.0 to 1.16.0 (#4302)

* Bump github.com/Azure/azure-sdk-for-go/sdk/azcore from 1.13.0 to 1.16.0

Bumps [github.com/Azure/azure-sdk-for-go/sdk/azcore](https://github.com/Azure/azure-sdk-for-go) from 1.13.0 to 1.16.0.
- [Release notes](https://github.com/Azure/azure-sdk-for-go/releases)
- [Changelog](https://github.com/Azure/azure-sdk-for-go/blob/main/documentation/release.md)
- [Commits](https://github.com/Azure/azure-sdk-for-go/compare/sdk/azcore/v1.13.0...sdk/azcore/v1.16.0)

---
updated-dependencies:
- dependency-name: github.com/Azure/azure-sdk-for-go/sdk/azcore
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update serverless vendor

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Zach Leslie <zach.leslie@grafana.com>

* Use Prometheus fast regexp (#4329)

* basic integration

Signed-off-by: Joe Elliott <number101010@gmail.com>

* patch tests for new meaning

Signed-off-by: Joe Elliott <number101010@gmail.com>

* patch up more tests

Signed-off-by: Joe Elliott <number101010@gmail.com>

* add basic tests

Signed-off-by: Joe Elliott <number101010@gmail.com>

* changelog + docs

Signed-off-by: Joe Elliott <number101010@gmail.com>

* remove benches

Signed-off-by: Joe Elliott <number101010@gmail.com>

* Cleaned up + tests

Signed-off-by: Joe Elliott <number101010@gmail.com>

* comment

Signed-off-by: Joe Elliott <number101010@gmail.com>

* lint

Signed-off-by: Joe Elliott <number101010@gmail.com>

* Update docs/sources/tempo/traceql/_index.md

Co-authored-by: Kim Nylander <104772500+knylander-grafana@users.noreply.github.com>

* comment

Signed-off-by: Joe Elliott <number101010@gmail.com>

---------

Signed-off-by: Joe Elliott <number101010@gmail.com>
Co-authored-by: Kim Nylander <104772500+knylander-grafana@users.noreply.github.com>

* Fix broken link in service-graphs docs (#4351)

* Fix minor typo in TraceQL docs (#4356)

* Bump default memcached version (#4363)

* Exemplar fixes (#4366)

* Fix exemplars based on duration to convert to seconds, fix various other issues

* changelog

* fix: initialize histogram buckets to 0 to avoid them being downsampled (#4368)

* initialized histogram buckets to 0 to avoid them being downsampled

* Ingester/Generator Live trace cleanup (#4365)

* moved trace sizes somewhere shareable

Signed-off-by: Joe Elliott <number101010@gmail.com>

* use tracesizes in ingester

Signed-off-by: Joe Elliott <number101010@gmail.com>

* make tests work

Signed-off-by: Joe Elliott <number101010@gmail.com>

* trace bytes in generator

Signed-off-by: Joe Elliott <number101010@gmail.com>

* remove traceCount

Signed-off-by: Joe Elliott <number101010@gmail.com>

* live trace shenanigans

Signed-off-by: Joe Elliott <number101010@gmail.com>

* changelog

Signed-off-by: Joe Elliott <number101010@gmail.com>

* Update modules/generator/processor/localblocks/livetraces.go

Co-authored-by: Mario <mariorvinas@gmail.com>

* Update modules/ingester/instance.go

Co-authored-by: Mario <mariorvinas@gmail.com>

* Test cleanup. Add sz test, restore commented out and fix e2e

Signed-off-by: Joe Elliott <number101010@gmail.com>

* remove todo comment

Signed-off-by: Joe Elliott <number101010@gmail.com>

---------

Signed-off-by: Joe Elliott <number101010@gmail.com>
Co-authored-by: Mario <mariorvinas@gmail.com>

* Bump anchore/sbom-action from 0.17.7 to 0.17.8 (#4371)

Bumps [anchore/sbom-action](https://github.com/anchore/sbom-action) from 0.17.7 to 0.17.8.
- [Release notes](https://github.com/anchore/sbom-action/releases)
- [Changelog](https://github.com/anchore/sbom-action/blob/main/RELEASE.md)
- [Commits](https://github.com/anchore/sbom-action/compare/v0.17.7...v0.17.8)

---
updated-dependencies:
- dependency-name: anchore/sbom-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update for IDs change

* Only run blockbuilder if ingest enabled

---------

Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Alex Bikfalvi <alex.bikfalvi@grafana.com>
Signed-off-by: Daniel Strobusch <1847260+dastrobu@users.noreply.github.com>
Co-authored-by: Javier Molina Reyes <javiermolinar@live.com>
Co-authored-by: Zach Leslie <zach.leslie@grafana.com>
Co-authored-by: Joe Elliott <number101010@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Ryan Perry <Rperry2174@gmail.com>
Co-authored-by: Kim Nylander <104772500+knylander-grafana@users.noreply.github.com>
Co-authored-by: Suraj Nath <9503187+electron0zero@users.noreply.github.com>
Co-authored-by: Alex Bikfalvi <alex@bikfalvi.com>
Co-authored-by: Andrey Karpov <ndk@users.noreply.github.com>
Co-authored-by: Jennifer Villa <jvilla2013@gmail.com>
Co-authored-by: Martin Disibio <martin.disibio@grafana.com>
Co-authored-by: Markus Toivonen <markus.toivonen@hoxhunt.com>
Co-authored-by: Daniel Strobusch <1847260+dastrobu@users.noreply.github.com>
Co-authored-by: Carles Garcia <carles.garciacabot@grafana.com>

* [rhythm] Changes to simplify operations (#4389)

* Use mapping for assigning partitions

* Use mapping for assigning partitions in the generator too

* Add support for SASL auth to kafka clients

* Add metrics to ingest (#4395)

* [rhythm] Extract block-builder into its own module (#4396)

* Extract block-builder into its own module

* Update /operations and examples

* No ephemeral storage

* No rolling strategy either

* fmt and compile

* Address review comment

* [rhythm] Correctly pass start/end time when appending a trace (#4410)

* Correctly pass start/end times

* Different code, same result

* [rhythm] Multiple fixes to block-builder consumption (#4413)

* Multiple fixes to cycle consumption

* fmt

* happy now?

* ups

* Rhythm: Separate non-flushing local blocks processor to store new queue data for reads (#4411)

* wip: separate non-flushing local blocks processor to store new queue data for reads

* Make real config for non-flushing local blocks processor, optional, validate wal config and use defaults if needed

* Fix defaulting of second WAL config

* [rhythm] Make ID generator more robust (#4416)

* Make ID generator more robust

* Simplify

* Update to e50f5d96b

* Fix registering of kafka read client metrics (#4502)

* [rhythm] Make ID generator more robust (#4416) (#4507)

* Make ID generator more robust

* Simplify

* Removed references to Loki and Mimir (#4509)

Signed-off-by: Joe Elliott <number101010@gmail.com>

* [Rhythm] Block builder test updates (#4510)

* Make blockbuilder tests closer to real kafka and less implementation specific by always enabling support for consumer groups, call commit control func in order

* Verify last committed offset in each test

* hide test function

* lint

* lint

* [Rhythm] Block-builder consumption loop (#4480)

* Alternate block-builder consume

* Set timeout on PollFetches, reduce initial poll delay, update 1 test to work using real consumergroup functionality

* restore metrics

* Re-add original partition lag metric, polled in separate goroutine. Fix consume loop to only consume full-duration cycles for more determinism

* merge conflict

* Review feedback

* Review feedback

* Comment

* code cleanup, lint

* logs

* code cleanup

* lint

* Review feedback

* Remove missed lookback_on_no_commit config in e2e tests and regen manifest

* Review feedback

* Fix rewind to latest commit to init correctly, it didn't work in some clusters (#4532)

* [rhythm] merge main at 71e8531 (#4531)

* Fixes

* More fixes

---------

Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Alex Bikfalvi <alex.bikfalvi@grafana.com>
Signed-off-by: Daniel Strobusch <1847260+dastrobu@users.noreply.github.com>
Co-authored-by: Javier Molina Reyes <javiermolinar@live.com>
Co-authored-by: Zach Leslie <zach.leslie@grafana.com>
Co-authored-by: Joe Elliott <number101010@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Ryan Perry <Rperry2174@gmail.com>
Co-authored-by: Kim Nylander <104772500+knylander-grafana@users.noreply.github.com>
Co-authored-by: Suraj Nath <9503187+electron0zero@users.noreply.github.com>
Co-authored-by: Alex Bikfalvi <alex@bikfalvi.com>
Co-authored-by: Andrey Karpov <ndk@users.noreply.github.com>
Co-authored-by: Jennifer Villa <jvilla2013@gmail.com>
Co-authored-by: Martin Disibio <martin.disibio@grafana.com>
Co-authored-by: Markus Toivonen <markus.toivonen@hoxhunt.com>
Co-authored-by: Daniel Strobusch <1847260+dastrobu@users.noreply.github.com>
Co-authored-by: Carles Garcia <carles.garciacabot@grafana.com>
2025-01-10 16:05:42 +01:00
..