Get in touch

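ORFIUM for Production Music Companies

Automate license management, increase revenue and efficiency, and gain full visibility into the health and performance of your catalog.

Join us

More revenue. More visibility. More trust.

Orfium brings the most powerful licensing software to the production music industry.

Seamless license management.
Revenue maximization.
Protection of your content from unlicensed use.
Full visibility into catalog performance.
Automated cue sheet management with Soundmouse.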

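SYNCTRACKER for Production Music Companies

Everything you need to manage your catalog with confidence.

Proactive UGC clearance

Clear licenses on UGC platforms with our SyncTracker product to reduce headaches and customer service emails.

Powerful reporting

Easily view and track key licensing analytics, and review usage, assets, videos, and license numbers by channel or user.

Seamless sync

Orfium's API works directly with UGC platforms, so each license is tailored to the needs of your business and your customers.

24/7 operations

Orfium's technology works around the clock, so you don't have to. We provide a managed, automated licensing solution through a web UI or directly through the API.

Global team

With offices around the world, a member of our team is always on hand to help if something goes wrong, in any time zone.

Bespoke solutions

We understand that your business has very specific needs, and we support you with solutions customized to them.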

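WHY ORFIUM

Why production music companies choose Orfium

Revenue

Orfium consistently delivers more revenue, with an average revenue impact of 56%, which is why leading companies trust Orfium to manage their catalogs.

Trust

Work with a trusted partner to help creators and rights holders get paid fairly, seamlessly, and on time.

Technology

Partner with a true technology company. Our growing team of more than 180 technologists continues to build the best technology in the industry.

People

Relationships are at the heart of our business. We value our people and the trust we have built with our partners.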

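Production music solutions

SyncTracker

License content to creators seamlessly while protecting your assets from unauthorized use. SyncTracker brings the most powerful licensing software to the production music industry.

SyncTracker integrates directly with UGC platforms for a seamless experience, and offers API integration into your website, third-party stores, or CRM as well as a web interface.

SyncTracker supports subscription and à la carte licensing models, and allows for start and expiry dates, usage restrictions, channel licenses, and more.

Client testimonials

Our production music partners say it better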

Orfium has been a key partner in our digital rights success at Audio Network. The SyncTracker software has enabled us to improve internal workflows, supply key data, and create a seamless experience for our clients. Orfium’s constant willingness to improve the software and engage in active feedback has been invaluable and has allowed us to stay at the forefront of digital music rights management to deliver the optimal experience to our customers and maximize revenues for our composers and artists.

Nick Platt

Director, Strategic Licensing and Partnerships

Audio Network

Orfium has been an essential partner for Audiosocket. Their Sync Tracker software is game-changing. It releases claims before they’re ever even made on our clients’ channels. This has enabled us to partner with global web media companies for music licensing while still protecting our copyrights. The identification tools, reporting and analytics are industry-leading and have led to more identified uses of our works that can be monetized.

Jenn Anderson-Miller

CEO

AudioSocket

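Discover the most powerful licensing software for the production music industry.

Join us

Solving the entertainment industry's biggest challenges

Get in touch

ORFIUM

A company powered by a passion for music and technology

Orfium is transforming the music and entertainment industries with market-leading technology. We design cutting-edge artificial intelligence (AI) algorithms and music reporting solutions so that every time music is played anywhere in the world, we are behind the scenes tracking it and delivering data, helping creators, rights holders, and media companies track, report, and monetize usage.

Our passion is building software that empowers the entertainment industry for all stakeholders.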

Timeline

The Orfium story

2016

Chris Mohoney and Drew Delis start Orfium in LA

2017

A team of engineers is established in Athens, Greece

2018

Rob Wells joins Orfium as CEO

2019

Orfium signs Sony, its first B2B client

2020

Orfium expands its customer base in the US and Europe

2021

Orfium expands into Japan with the acquisition of Breaker

2022

Orfium generates over $200 million in revenue for rights holders and creators

2023

Orfium acquires Soundmouse, the global leader in cue sheets and music reporting

Today

Orfium has more than 100 clients and a global team of more than 500

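About Orfium

The journey so far…

Orfium was founded in Los Angeles in 2015. Chris Mohoney and Drew Delis set out on a mission to address the growing challenges facing the music industry in the digital age and to bring cutting-edge technology to the music rights management space.

Rob Wells joined the business in 2017 and helped turn the company into a trusted partner for record labels, music publishers, production music companies, and collecting societies.

In 2021, Orfium expanded into Japan with the acquisition of Breaker. This was followed in 2023 by the acquisition of Soundmouse, a leader in music cue sheet reporting and audio recognition. These acquisitions allow Orfium to scale its product offering and serve more of the entertainment ecosystem with market-leading technology and teams.

Today, we aim to support all stakeholders in the music and entertainment industries. We are working toward our vision of being embedded across the global entertainment ecosystem, processing data and revenue every time content is consumed, so that the industry is supported in the best possible way.

Our mission is to improve the entertainment ecosystem through innovative technology.

Join us on our journey!

Why we are here

Our mission

We fuel a passion for music and entertainment through data and technology.

Looking ahead

Our vision

Our vision is to be embedded across the global entertainment ecosystem, processing data and revenue every time content is consumed, so that the industry is supported in the best possible way.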

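Life at ORFIUM

We are music lovers, developers, data scientists, and designers, working together to make the entertainment industry better for everyone.

Join us

People

People and relationships are at the heart of our business.

The difference we are making would not be possible without our passionate, dedicated, and constantly innovating people. We are committed to creating a fair and transparent workplace where everyone can grow and be themselves.

Innovation

We are better together

We work with our teams and clients to develop market-leading solutions that support and transform the entertainment industry for all stakeholders.

Passion

We are driven by a passion for music and technology.

We focus on helping artists, songwriters, and producers pursue their passion for creating. We are building the technology and infrastructure needed to support the entertainment industry of today and tomorrow.

Trust

We act as a trusted advisor.

We earn the trust of our teams, clients, and the industry through expertise, knowledge, transparency, security, compliance, and performance.

The ORFIUM name

The name Orfium comes from Orpheus, the legendary Greek hero famed for his musical talent.

Our R&D team of more than 180 people is based in Athens, Greece.

We have offices in the US, the UK, Japan, Greece, Ireland, and South Korea, and a growing team of more than 500 Orfiumers.

We are always looking for new talent. If you are interested in joining one of the fastest-growing technology companies in music and entertainment, get in touch.

Join us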

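Team

One global team

500+

Orfiumers

Six

Countries

75%

of Orfiumers cite "our people" as the top reason for working here

We're hiring!

Join us

Hear from our team

How would you describe Orfium to a friend?

"It's a great place to work, with a wonderful company culture that supports a healthy work-life balance. Everyone is very passionate about music, and it shows. You build close working relationships with colleagues that extend into friendships outside of work. The tools we use are cutting-edge and set the industry standard, which puts you in a very strong position for future growth."

Orfium employee, team survey 2022

Create with us

If you love music and technology as much as we do, you will love Orfium.

Open positions

We are always looking for raw talent, team players, and big thinkers. Take a look at our open roles below.

If you don't see anything right now, check back soon. We are always adding new roles across our global offices.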

Product Manager – Music Industry Solutions (HYBRID)

Full-time

Product Manager

Athens, Attica, GR

learn more

Product Manager – Music Industry Solutions (HYBRID)

Full-time

Product Manager

Dublin, County Dublin, IE

learn more

Product Owner – Music Industry Solutions (HYBRID)

Full-time

Product Owner

Athens, Attica, GR

learn more

C# Software Developer

Engineering

Dublin, County Dublin, IE

learn more

Software Engineer in Test (HYBRID)

Full-time

QA Engineer

Athens, Attica, GR

learn more

Software Engineer in Test

Full-time

QA Engineer

Dublin, County Dublin, IE

learn more

Head of Information Technology & Security

Full-time

IT & Security

Athens, Attica, GR

learn more

Get in touch

We would love to hear from you. Share your details below and a member of our team will be in touch shortly.

Los Angeles

The Enclave
22619 Pacific Coast Hwy
Suite B260 Malibu,
CA 90265

(310) 317-4227

Athens

Kallirois 103, 117 45
Athens, Greece

(+30) 21 0922 8995

London

26 Litchfield St
London WC2H 9TZ

Tokyo

Seizan #801 2-26-32
Tokyo 107-006 Japan

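Music, data, rights.
We help you monetize.

Find, use, track, and monetize music content across all channels. Orfium is a global technology company solving the entertainment industry's biggest challenges in digital music and broadcast rights management, cue sheets, data, and reporting.

Contact us

Solving the entertainment industry's challenges

Solutions for every business

Music publishers

Maximize revenue, optimize UGC channels, and streamline cue sheet management.

Show more

Production music companies

Automate license management, grow UGC revenue, and get unrivaled performance reporting.

Show more

Record labels

Grow UGC revenue, get industry-leading reporting, and automate cue sheet management.

Show more

Collecting societies

Optimize data, automate and streamline operations, and access Soundmouse's global cue sheet network.

Show more

Broadcasters

Streamline cue sheets and production data, and easily find and license content with Soundmouse.

Show more

Digital service providers

Easily access licensed content. Improve reporting, insights, and compliance for licensed works and recordings.

Show more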

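WHY ORFIUM

The ORFIUM difference

Unrivaled revenue

Cutting-edge technology and our team of industry experts deliver more revenue and better service.

Powerful technology

Industry-leading technology, data, and music reporting, delivered by our growing team of more than 180 technologists.

Passionate people

Empowering the entertainment ecosystem

Data-driven. Results-focused.

21M+

Claims made

+56%

Orfium's average revenue impact

8.9M+

Licenses processed

8.4M+

Cue sheets processed in 2022

3.2M+

Conflicts resolved

1.2M+

Licensed customers

Take control of your music content

Get started today

Product focus

SyncTracker for Production Music Companies

SyncTracker supports subscription and à la carte licensing models, and allows for start and expiry dates, usage restrictions, channel licenses, and more.

Featured

Client testimonials

Hear from our partners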

Nick Platt

Director, Strategic Licensing and Partnerships

Audio Network

Jenn Anderson-Miller

CEO

AudioSocket

Talk to us today

Join us

Today, we’re delighted to share that ORFIUM has acquired Soundmouse, bringing together the global market leaders in digital music and broadcast rights management and reporting.

Who is Soundmouse?

Soundmouse is a global leader in music cue sheet reporting and monitoring for the broadcast and entertainment production space. They share our vision to revolutionize digital music and broadcast rights management and will join ORFIUM to deliver even more benefits to creators, rights holders, broadcasters and collecting societies with cutting edge technology and industry expertise.

Soundmouse has set the global standard for cue sheet and music reporting around the world, connecting all stakeholders in the reporting process including broadcasters, producers, collecting societies, distributors, program makers and music creators themselves. It works for major broadcasters, media companies and streaming platforms.

Why is ORFIUM acquiring Soundmouse?

Bringing Soundmouse into the ORFIUM family, we’re moving to a place where we can serve the entire entertainment ecosystem across mainstream and digital media. By connecting creators, rights holders, and music users, we can deliver even more value to stakeholders across the board.

Combining Soundmouse’s leadership in cue sheet management and monitoring for the broadcast and entertainment production space and ORFIUM’s expertise in UGC tracking and claiming for publishers, labels and production music companies, we bring the worlds of digital and broadcast together in an integrated way. This will allow us to scale our product offering and expand deeper into the complex infrastructure of the entertainment industry, streamlining content creation and management for program makers, broadcasters and music rights holders.

What does the Soundmouse acquisition mean for ORFIUM?

Since 2015, ORFIUM has innovated to bring cutting edge technology to the music rights management space and has generated hundreds of millions of dollars in additional revenue for its partners, which includes top global record labels, music publishers, production music companies, and collecting societies.

Making music easier to find, use, track and monetize across all channels is one of the core problems we’re helping to solve for the industry. Acquiring Soundmouse enables us to scale our product offering and expand deeper into the complex infrastructure of the entertainment industry, streamlining content creation and management for program makers, broadcasters and music rights holders.

What does the future look like for ORFIUM following the Soundmouse acquisition?

We’re committed to solving the entertainment industry’s most complex problems. We continue to develop technology solutions built on the latest in machine learning and AI, empowering rights owners, creators and key stakeholders to realize more value as new platforms for media consumption emerge and scale.

At a time when it has never been harder for creators, rights holders and media companies to track and monetize usage with the proliferation of new channels, platforms and the growth of the Metaverse and Web3, there is no other company in this space building and investing in technology like ORFIUM. ORFIUM is committed to delivering the technology needed to support the entertainment industry of today and the future.

Acquiring Soundmouse is a great start to the year for ORFIUM. We’re excited to welcome the Soundmouse team to join ours and integrate our combined technology, teams and expertise to bring even more value to the entertainment ecosystem.

Stay tuned, lots more still to come from ORFIUM in 2023!

The ORFIUM team

To learn more about how ORFIUM can support you in unlocking more value, contact our team today!

Black box testing for non-data engineers with DBT

Black box testing is a software testing method in which the functionalities of software applications are tested without having knowledge of internal code structure, implementation details, and internal paths. Let’s borrow that term and use the same analogy to test our black boxes, meaning our dbt models.

So, adapting the lexicon from software engineering, we have:

Black Box: the dbt model that plays the role of transformation

Input(s): the different tables that are used in the query. In dbt we call these sources or references

Output: the table formed after the dbt model has transformed the data

DBT and testing

dbt stands for data build tool. We use it, because of its simplicity, to perform transformations in our Snowflake Data Cloud for analytical purposes. Not to overstate the matter, but we love it.

What, exactly, is dbt?

Building a Mature Analytics Workflow

dbt already offers dbt tests, and running them is a great way to test the data in your tables. By default, the available tests are unique, not_null, accepted_values, and relationships. We can even create custom tests, and there are a variety of extensions out there that stretch dbt functionality with additional tests, such as Great Expectations and dbt-utils. These kinds of tests examine the values of your tables, and they are a great way to identify any critical data quality issues. dbt tests look at the output. However, what we want to do is to test the black box, the transformation.

TDD and Data

Working with Large Tables

More often than not, the tables that we need to build models upon are huge, and accessing billions of rows and performing transformations on them takes a long time. A 30-minute transformation might be acceptable when it is part of a production pipeline, but having to wait half an hour to develop and test the correctness of your transformation is, well, less than ideal.

Of course you are going to run it against the table, but minimizing the number of runs makes everyone happy. This also limits your Snowflake Warehouse Usage which can save cost and make accountants happy as well.

Edge cases not covered in actual data

Another problem we often face is having a dbt model that works for all intents and purposes for multiple months, only to later discover that there are cases which we didn’t think of. Unsurprisingly, having billions of rows of data means that all the possible scenarios are not at all easy to cover. If only there was a way to test for those cases, as well. The solution we at Orfium use is to generate mock data. They may not be real, but they work well enough to cover our edge cases and future-proof our dbt instances.

Good Tests VS Bad Tests

Writing tests for the sake of writing them is worse than not writing them at all. There, we said it.

Let’s face it, how many times do we introduce tests on a piece of software, get excited and, thanks to the quick TDD process, we just gleam with self-confidence? Before you know it, we’re writing tests that have no value at all and inventing a fantastic metric called coverage. Coverage is important but not as a single metric. It is only a first indication and should not be used as a goal in itself. Good tests are the ones that provide value. Bad tests, on the other hand, only add to the debt and the maintenance. Remember, tests are a means to an end. To what end? Writing robust code.

Tests as a requirements gathering tool

How many times have we found ourselves sitting in a room with a stakeholder who is describing a new report that they need? We start formulating questions and, after some back and forth, sooner or later we reach the final requirements of the report. Happily enough, after the meeting we go to our favorite warehouse, only to discover some flaw in the original request that we didn't think of during requirements gathering. Working in an agile environment, that's no issue. We just schedule a follow-up meeting and reach a consensus on the edge cases. Final delivery is reached. However, wouldn't it be better if actual cases could be drafted in that first meeting? Business and engineering minds often don't mesh well, so we can use all the help we can get.

Establishing actual scenarios of what a table could look like, and what the result would be, helps a lot in the process of gathering requirements.

Consider the following imaginary scenario:

Stakeholder:

For our table that contains our daily revenue for all the videos, I would like a monthly summary revenue per video for advertisement category.

Engineer (gotcha):

select
    video_id,
    year(date_rev) as year,
    month(date_rev) as month,
    sum(revenue) revenue
from
    fct_videos_rev
where
    category = 'advertisement'
group by
    video_id,
    year(date_rev),
    month(date_rev)

Stakeholder

I would also like to see how many records the summation was comprised of.

Engineer (gotcha):

select
    video_id,
    year(date_rev) as year,
    month(date_rev) as month,
    sum(revenue) revenue,
    count(*) counts
from
    fct_videos_rev
where
    category = 'advertisement'
group by
    video_id,
    year(date_rev),
    month(date_rev)

Stakeholder

That can’t be right. Why so many counts?

Engineer

There are many rows with zero revenues, I see. You don’t want them to count towards your total count, is that right?

Stakeholder

Yes.

Engineer (gotcha):

select
    video_id,
    year(date_rev) as year,
    month(date_rev) as month,
    sum(revenue) revenue,
    count(*) counts
from
    fct_videos_rev
where
    category = 'advertisement'
    and revenue > 0
group by
    video_id,
    year(date_rev),
    month(date_rev)

Of course, this is an exaggerated example. However, imagine if the same dialog went a different way.

Stakeholder:

For our table that contains our daily revenue for all the videos, I would like a monthly revenue summary per video for the advertisement category.

Engineer:

If the table has the form:

video_id	date_rev	category	revenue
video_a	2022-02-12	advertisement	10
video_a	2022-02-12	advertisement	0
video_a	2022-03-12	subscription	15
video_a	2022-03-12	advertisement	1

Is the result you want like the following?

video_id	year	month	revenue
video_a	2022	02	10
video_a	2022	03	1

Stakeholder

I would also like to see how many records the summation was comprised of.

Engineer:

So the result you want looks like this:

video_id	year	month	revenue	counts
video_a	2022	02	10	2
video_a	2022	03	1	1

Stakeholder

Why does the first row have 2 counts?

Engineer

There are two rows for that month, and one of them has zero revenue, I see. You don't want zero-revenue rows to count towards your total count, is that right?

Stakeholder

Yes.

Engineer (gotcha):

video_id	year	month	revenue	counts
video_a	2022	02	10	1
video_a	2022	03	1	1

And all that without having to write a single line of code. Not that an engineer is afraid to write SQL queries. But really, a lot of time is lost in translating business requirements into SQL queries. They are never that simple, and they are almost never correct on the first try either.
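Once the input and output tables from the dialog above are agreed, they can be turned directly into an executable check. The following is only a minimal sketch of that idea in Python, using an in-memory SQLite database rather than Snowflake. The function name run_scenario is hypothetical, and because SQLite has no year() or month() functions, strftime() stands in for them here.

```python
import sqlite3

# Mock input agreed with the stakeholder (the rows from the example table).
MOCK_ROWS = [
    ("video_a", "2022-02-12", "advertisement", 10),
    ("video_a", "2022-02-12", "advertisement", 0),
    ("video_a", "2022-03-12", "subscription", 15),
    ("video_a", "2022-03-12", "advertisement", 1),
]

# Expected output agreed with the stakeholder.
EXPECTED = [
    ("video_a", 2022, 2, 10, 1),
    ("video_a", 2022, 3, 1, 1),
]

# SQLite dialect of the final query: strftime() replaces year()/month().
QUERY = """
select
    video_id,
    cast(strftime('%Y', date_rev) as integer) as year,
    cast(strftime('%m', date_rev) as integer) as month,
    sum(revenue) as revenue,
    count(*) as counts
from fct_videos_rev
where category = 'advertisement'
  and revenue > 0
group by
    video_id,
    strftime('%Y', date_rev),
    strftime('%m', date_rev)
order by video_id, year, month
"""


def run_scenario(rows):
    """Load the mock rows into a throwaway table and run the query on them."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table fct_videos_rev "
            "(video_id text, date_rev text, category text, revenue integer)"
        )
        conn.executemany("insert into fct_videos_rev values (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    result = run_scenario(MOCK_ROWS)
    print("PASS" if result == EXPECTED else f"FAIL: {result}")
```

Dropping the revenue > 0 condition from the query makes the February row come back with a count of 2, which is exactly the misunderstanding the dialog uncovered.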

Tests so software engineers can get onboard in SQL

Orfium is a company which, at the time of writing this post, consists of more than 150 engineers. Only 6 of those are data engineers. That might sound strange, given that we are a data-heavy company dealing with billions of rows of data on a monthly basis. So, a new initiative has emerged called data-mesh. This is a program which we practice on a daily basis and are super proud of. One consequence of data mesh is that there are multiple teams handling their own instance of dbt. But, this will be discussed in detail in another post. Stay tuned!

For the most part, software engineers are not familiar with writing complex SQL queries. That's not their fault: with the variety of ORM tools available, they rarely need to. However, something that software engineers do know how to do very well is to write tests.

In order to bridge that gap, practicing test-driven development on writing SQL is something that can help a lot of engineers to get onboard.

Let the fun begin

We designed a way to test dbt models (the black box). Our main drivers are:

Introduce a few changes so that new or mature projects can start using it, without breaking existing behavior.
Find a way to define test scenarios and identify which of them failed.

We start by introducing the following macros:

{%- macro ref_t(table_name) -%}
    {%- if var('model_name','') == this.table -%}
        {%- if var('test_mode',false) -%}
            {%- if var('test_id','not_provided') == 'not_provided' -%}
                {%- do exceptions.warn("WARNING: test_mode is true but test_id is not provided, rolling back to normal behavior") -%}
                {{ ref(table_name) }}
            {%- else -%}
                {%- do log("stab ON, replace table: ["+table_name+"] --> ["+this.table+"_MOCK_"+table_name+"_"+var('test_id')+"]", info=True) -%}
                {{ ref(this.table+'_MOCK_'+table_name+'_'+var('test_id')) }}
            {%- endif -%}
        {%- else -%}
            {{ ref(table_name) }}
        {%- endif -%}
    {%- else -%}
        {{ ref(table_name) }}
    {%- endif -%}
{%- endmacro -%}

{%- macro source_t(schema, table_name) -%}
    {%- if var('model_name','') == this.table -%}
        {%- if var('test_mode',false) -%}
            {%- if var('test_id','not_provided') == 'not_provided' -%}
                {%- do exceptions.warn("WARNING: test_mode is true but test_id is not provided, rolling back to normal behavior") -%}
                {{ builtins.source(schema,table_name) }}
            {%- else -%}
                {%- do log("stab ON, replace table: ["+schema+"."+table_name+"] --> ["+this.table+"_MOCK_"+table_name+"_"+var('test_id')+"]", info=True) -%}
                {{ ref(this.table+'_MOCK_'+table_name+'_'+var('test_id')) }}
            {%- endif -%}
        {%- else -%}
            {{ builtins.source(schema,table_name) }}
        {%- endif -%}
    {%- else -%}
        {{ builtins.source(schema,table_name) }}
    {%- endif -%}
{%- endmacro -%}

These macros optionally override the behavior of dbt's built-in source and ref macros.

model_name: the model actually being tested
test_mode: a flag that indicates whether test mode is enabled
test_id: the test scenario that is going to be mocked
table_name (argument): the source table, which is either used as the true source or stubbed and replaced with one of our own
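The branching inside the two macros can be easier to follow outside Jinja. The sketch below restates it in plain Python under the assumption that the dbt variables arrive as a dictionary. The function resolve_table and its parameters are hypothetical names used for illustration only and are not part of dbt or of the macros themselves.

```python
def resolve_table(table_name, this_table, dbt_vars):
    """Return the relation name a model would read from.

    Mirrors the branching of ref_t / source_t: the mock seed is used only
    when model_name matches the model being built, test_mode is on, and a
    test_id has been provided. In every other case the real table is used.
    """
    if dbt_vars.get("model_name", "") != this_table:
        # A different model is under test, so this one behaves normally.
        return table_name
    if not dbt_vars.get("test_mode", False):
        return table_name
    test_id = dbt_vars.get("test_id", "not_provided")
    if test_id == "not_provided":
        # The macro emits a warning here and falls back to normal behavior.
        return table_name
    return f"{this_table}_MOCK_{table_name}_{test_id}"


if __name__ == "__main__":
    print(resolve_table(
        "VIDEOS_INFO",
        "VIDEOS_INFO_SUMMARY",
        {
            "test_mode": True,
            "test_id": "MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE",
            "model_name": "VIDEOS_INFO_SUMMARY",
        },
    ))
```

Because the check on model_name comes first, enabling test mode for one model leaves every other model in the project reading from its real sources, which is what lets existing projects adopt the macros without breaking current behavior.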

Prefer multiple small test cases over few large test cases

Test cases should test something specific. Generating Mock data that contain hundreds of records that test multiple business rules should be avoided. Should the test case fail, it should be easy to identify the cause and its impact.

Suppose we would like to create a test_id with the name: MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE for our model VIDEOS_INFO_SUMMARY which uses a source VIDEOS_INFO

We create a new folder under seeds MOCK_VIDEOS_INFO_SUMMARY

We create the input seed seeds/MOCK_VIDEOS_INFO_SUMMARY/VIDEOS_INFO_SUMMARY_MOCK_VIDEOS_INFO_MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE.csv which plays the role of input

VIDEO_ID,DATE_REV,CATEGORY,REVENUE
video_a,2022-02-12,advertisement,10
video_a,2022-02-12,advertisement,0
video_a,2022-03-12,other,15
video_a,2022-03-12,advertisement,1

We create the output seed seeds/MOCK_VIDEOS_INFO_SUMMARY/VIDEOS_INFO_SUMMARY_MOCK_RESULTS_MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE.csv which plays the role of the output we would like to have once the model has run

VIDEO_ID,YEAR,MONTH,REVENUE,COUNTS
video_a,2022,2,10,1
video_a,2022,3,1,1

We also create a yml seeds/MOCK_VIDEOS_INFO_SUMMARY/VIDEOS_INFO_SUMMARY.yml as follows:

version: 2

seeds:
  - name: VIDEOS_INFO_SUMMARY_MOCK_RESULTS_MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE
    config:
      enabled: "{{ var('test_mode', false) }}"

  - name: VIDEOS_INFO_SUMMARY_MOCK_VIDEOS_INFO_MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE
    config:
      enabled: "{{ var('test_mode', false) }}"

Notice that the seeds are created only on test_mode. This allows us to omit creating those seeds on default behavior.

Now we define the test inside our yml model definition:

models:
  - name: VIDEOS_INFO_SUMMARY
    description: "Summary of VIDEOS_INFO"
    tests:
        - dbt_utils.equality:
            tags: ['test_VIDEOS_INFO_SUMMARY_MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE']
            compare_model: ref('VIDEOS_INFO_SUMMARY_MOCK_RESULTS_MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE')
            compare_columns:
              - VIDEO_ID
              - YEAR
              - MONTH
              - REVENUE
              - COUNTS
            enabled: "{{ var('test_mode', false) }}"
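Conceptually, the equality test in the definition above passes when the model output and the expected seed contain the same rows for the listed columns. The following Python sketch illustrates that comparison on in-memory rows. It is a rough illustration of the idea and not how dbt_utils implements the test, and equality_diff is a hypothetical helper name.

```python
def equality_diff(actual, expected, columns):
    """Compare two lists of row dictionaries on the given columns.

    Returns the rows that appear on only one side. Both lists being empty
    corresponds to a passing test.
    """
    def project(rows):
        # Keep only the compared columns, in a fixed order, as hashable tuples.
        return {tuple(row[col] for col in columns) for row in rows}

    actual_rows = project(actual)
    expected_rows = project(expected)
    return {
        "unexpected": sorted(actual_rows - expected_rows),
        "missing": sorted(expected_rows - actual_rows),
    }


if __name__ == "__main__":
    columns = ["VIDEO_ID", "YEAR", "MONTH", "REVENUE", "COUNTS"]
    expected = [
        {"VIDEO_ID": "video_a", "YEAR": 2022, "MONTH": 2, "REVENUE": 10, "COUNTS": 1},
        {"VIDEO_ID": "video_a", "YEAR": 2022, "MONTH": 3, "REVENUE": 1, "COUNTS": 1},
    ]
    # Output of a model that forgot to filter out zero-revenue rows.
    actual = [
        {"VIDEO_ID": "video_a", "YEAR": 2022, "MONTH": 2, "REVENUE": 10, "COUNTS": 2},
        {"VIDEO_ID": "video_a", "YEAR": 2022, "MONTH": 3, "REVENUE": 1, "COUNTS": 1},
    ]
    print(equality_diff(actual, expected, columns))
```

Reporting the differing rows in both directions is what makes a small, focused test case easy to debug: a single unexpected row points straight at the business rule that was broken.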

Our model:

{{
    config
    (
        materialized = 'table'
    )
}}

SELECT
    VIDEO_ID,
    YEAR(DATE_REV) AS YEAR,
    MONTH(DATE_REV) AS MONTH,
    SUM(REVENUE) REVENUE,
    COUNT(*) COUNTS
FROM
    {{ source_t('MY_SCHEMA','VIDEOS_INFO') }}
WHERE
    CATEGORY = 'advertisement'
    AND REVENUE > 0
GROUP BY
    VIDEO_ID,
    YEAR(DATE_REV),
    MONTH(DATE_REV)

Notice the source_t usage instead of using the default source macro.

To run a test, we go through the following steps.

Load up our seeds as:

dbt seed --full-refresh -m MOCK_VIDEOS_INFO_SUMMARY --vars '{"test_mode":true}'

Then execute our model as:

dbt run -m VIDEOS_INFO_SUMMARY --vars '{"test_mode":true,"test_id":"MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE","model_name":"VIDEOS_INFO_SUMMARY"}'

And then execute dbt test to check if our black box behaved as it should:

dbt test --select tag:test_VIDEOS_INFO_SUMMARY_MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE --vars '{"test_mode":true,"test_id":"MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE","model_name":"VIDEOS_INFO_SUMMARY"}'

Note: Because the whole process is a bit tedious with writing all those big commands, we wrote a bash script which automates all three steps:

The requirement is to create a file conf_test/tests_definitions.csv which has the format:

# MODEL_NAME,TEST_ID
VIDEOS_INFO_SUMMARY,MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE

Script reads this file and executes all the tests defined in the file in order
Executing tests of only a specific model is supported by passing -m flag ./dbt_test.sh -m VIDEOS_INFO_SUMMARY
Executing a specific test case is supported by passing -t flag ./dbt_test.sh -t MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE
Lines that start with # are skipped
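The behavior listed above can be sketched in a few lines of Python. This is not the dbt_test.sh script from the repository, only an illustration of the same logic: it parses the definitions file, skips comment lines, applies the optional model and test filters, and builds the three dbt commands shown earlier. All function names are hypothetical, and the sketch only prints the commands rather than executing them.

```python
import json
import shlex


def parse_definitions(text):
    """Parse MODEL_NAME,TEST_ID lines, skipping blanks and lines starting with #."""
    tests = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        model, test_id = (part.strip() for part in line.split(",", 1))
        tests.append((model, test_id))
    return tests


def select_tests(tests, model=None, test_id=None):
    """Apply the optional filters (the -m and -t flags of the script)."""
    return [
        (m, t)
        for m, t in tests
        if (model is None or m == model) and (test_id is None or t == test_id)
    ]


def build_commands(model, test_id):
    """Build the seed, run, and test commands for one scenario."""
    seed_vars = json.dumps({"test_mode": True}, separators=(",", ":"))
    run_vars = json.dumps(
        {"test_mode": True, "test_id": test_id, "model_name": model},
        separators=(",", ":"),
    )
    return [
        ["dbt", "seed", "--full-refresh", "-m", f"MOCK_{model}", "--vars", seed_vars],
        ["dbt", "run", "-m", model, "--vars", run_vars],
        ["dbt", "test", "--select", f"tag:test_{model}_{test_id}", "--vars", run_vars],
    ]


if __name__ == "__main__":
    definitions = (
        "# MODEL_NAME,TEST_ID\n"
        "VIDEOS_INFO_SUMMARY,MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE\n"
    )
    for model, test_id in select_tests(parse_definitions(definitions)):
        for command in build_commands(model, test_id):
            # shlex.join quotes the JSON argument so it can be pasted into a shell.
            print(shlex.join(command))
```

Building each command as a list of arguments avoids the shell-quoting problems that the JSON passed to --vars tends to cause when the commands are typed by hand.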

The set-up described above relies on a few conventions that must be followed, otherwise the script and macros might not work:

The seed folder must be named MOCK_{model_we_test}
The seed which plays the role of input must be named {model_we_test}_MOCK_{model_we_stab}_{test_id}
The seed which plays the role of the expected result must be named {model_we_test}_MOCK_RESULTS_{test_id}
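Since these names are easy to mistype, it can help to derive them from the three identifiers instead of writing them by hand. The sketch below does that in Python. The helper seed_layout is a hypothetical name, and the seeds/ prefix is taken from the example paths earlier in this post.

```python
from pathlib import PurePosixPath


def seed_layout(model, stubbed_table, test_id):
    """Return the seed folder and file paths required by the naming conventions."""
    folder = PurePosixPath("seeds") / f"MOCK_{model}"
    return {
        "folder": str(folder),
        # Input seed: {model_we_test}_MOCK_{model_we_stab}_{test_id}
        "input_seed": str(folder / f"{model}_MOCK_{stubbed_table}_{test_id}.csv"),
        # Expected result seed: {model_we_test}_MOCK_RESULTS_{test_id}
        "expected_seed": str(folder / f"{model}_MOCK_RESULTS_{test_id}.csv"),
    }


if __name__ == "__main__":
    layout = seed_layout(
        "VIDEOS_INFO_SUMMARY", "VIDEOS_INFO", "MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE"
    )
    for role, path in layout.items():
        print(f"{role}: {path}")
```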

All the code exists in the following repo: https://github.com/vasilisgav/dbt_tdd_example

To see it in practice:

set up a tdd_example profile
make sure you run dbt deps to install dbt_utils
make the script executable chmod +x dbt_test.sh
and finally execute the script ./dbt_test.sh

RESULTS:

Working with this approach has paid off, as one would expect of any TDD approach. It has been a big win for how we release our dbt models.

Pros

models have grown to become quite clean with their business clearly depicted
business rules can easily be verified, especially their changes
business voids are identified faster
business requirements are generated in a cleaner, more efficient way
quick development: surprising as it may sound when dealing with billions of rows, the fewer runs we perform on the full table, the quicker the development
regression tests are handled by our github actions ensuring our models behave as expected (multiple puns here 😀 )
QA can happen independently of our dev
Warehouse usage is limited

Cons:

Table sources with many columns are sometimes cumbersome to mock, although columns that the model does not select do not need to be defined in the mock CSVs
It’s somewhat difficult to start

So, what are the key takeaways? That testing is important but good, smart testing can truly free an organization of a lot of daily tedium and allow it, as it has us, to focus more on serving the business efficiently and with the least amount of friction.

Vasilis Gavriilidis

Senior Data Engineer @ ORFIUM

https://www.linkedin.com/in/vgavriilidis/

https://github.com/vasilisgav/
