Orfium brings the most powerful licensing software to the music production industry.
Seamless license management.
Maximize your revenue.
Protect your content from unlicensed use.
Get complete visibility into your catalog's performance.
Automated cue sheet management with Soundmouse.
License your content to creators seamlessly while protecting your assets from unauthorized use. SyncTracker brings the most powerful licensing software to the music production industry.
SyncTracker integrates directly with UGC platforms for a seamless experience, offering not only a web interface but also API integrations with your website, third-party stores, or CRM.
SyncTracker supports both subscription and à la carte licensing models, allowing start and expiration dates, usage limits, channel licensing, and more.
Orfium was founded in Los Angeles in 2015. Chris Mohoney and Drew Delis set out on a mission to bring cutting-edge technology to the music rights management space, responding to the growing challenges facing the music industry in the digital age.
Rob Wells joined the business in 2017 and helped transform the company into a trusted partner for record labels, music publishers, production music companies, and collecting societies.
In 2021, Orfium expanded into Japan with the acquisition of Breaker. This was followed in 2023 by the acquisition of Soundmouse, a leader in music cue sheet reporting and audio recognition. These moves allowed Orfium to expand its products and serve more of the entertainment ecosystem with market-leading technology and teams.
Today, we aim to support every stakeholder in the music and entertainment industries. We are working toward a vision of being embedded across the global entertainment ecosystem, processing the data and revenue from every piece of content consumed, so that the industry is supported in the best possible way.
Our mission is to improve the entertainment ecosystem through innovative technology.
Join us on our journey!
The difference we make would not be possible without our passionate, dedicated, and constantly innovating employees. We are committed to creating a fair and transparent workplace where everyone can grow and be themselves.
We collaborate with our teams and clients to develop market-leading solutions that support and transform the entertainment industry for all stakeholders.
We focus on helping artists, songwriters, and producers pursue their passion for creation. We are committed to building the technology and infrastructure needed to support the entertainment industry of today and the future.
We earn the trust of our teams, clients, and the industry through expertise, knowledge, transparency, security, compliance, and performance.
With an R&D team of more than 180 people based in Athens, Greece, the name Orfium derives from Orpheus, the legendary Greek hero famed for his musical talent.
Our team of more than 500 Orfiumers is growing, with offices in the US, UK, Japan, Greece, Ireland, and South Korea.
We are always looking for new talent. If you are interested in joining one of the fastest-growing technology companies in music and entertainment, get in touch.
We help you monetize.
Today, we’re delighted to share that ORFIUM has acquired Soundmouse, bringing together the global market leaders in digital music and broadcast rights management and reporting.
Soundmouse is a global leader in music cue sheet reporting and monitoring for the broadcast and entertainment production space. They share our vision to revolutionize digital music and broadcast rights management and will join ORFIUM to deliver even more benefits to creators, rights holders, broadcasters and collecting societies with cutting edge technology and industry expertise.
Soundmouse has set the global standard for cue sheet and music reporting around the world, connecting all stakeholders in the reporting process including broadcasters, producers, collecting societies, distributors, program makers and music creators themselves. It works for major broadcasters, media companies and streaming platforms.
Bringing Soundmouse into the ORFIUM family, we’re moving to a place where we can serve the entire entertainment ecosystem across mainstream and digital media. By connecting creators, rights holders, and music users, we can deliver even more value to stakeholders across the board.
Combining Soundmouse’s leadership in cue sheet management and monitoring for the broadcast and entertainment production space and ORFIUM’s expertise in UGC tracking and claiming for publishers, labels and production music companies, we bring the worlds of digital and broadcast together in an integrated way. This will allow us to scale our product offering and expand deeper into the complex infrastructure of the entertainment industry, streamlining content creation and management for program makers, broadcasters and music rights holders.
Since 2015, ORFIUM has innovated to bring cutting edge technology to the music rights management space and has generated hundreds of millions of dollars in additional revenue for its partners, which includes top global record labels, music publishers, production music companies, and collecting societies.
Making music easier to find, use, track and monetize across all channels is one of the core problems we’re helping to solve for the industry. Acquiring Soundmouse enables us to scale our product offering and expand deeper into the complex infrastructure of the entertainment industry, streamlining content creation and management for program makers, broadcasters and music rights holders.
We’re committed to solving the entertainment industry’s most complex problems. We continue to develop technology solutions built on the latest in machine learning and AI, empowering rights owners, creators and key stakeholders to realize more value as new platforms for media consumption emerge and scale.
At a time when it has never been harder for creators, rights holders and media companies to track and monetize usage with the proliferation of new channels, platforms and the growth of the Metaverse and Web3, there is no other company in this space building and investing in technology like ORFIUM. ORFIUM is committed to delivering the technology needed to support the entertainment industry of today and the future.
Acquiring Soundmouse is a great start to the year for ORFIUM. We’re excited to welcome the Soundmouse team and to integrate our combined technology, teams and expertise to bring even more value to the entertainment ecosystem.
Stay tuned, lots more still to come from ORFIUM in 2023!
The ORFIUM team
To learn more about how ORFIUM can support you in unlocking more value, contact our team today!
Black box testing is a software testing method in which the functionalities of software applications are tested without having knowledge of internal code structure, implementation details, and internal paths. Let’s borrow that term and use the same analogy to test our black boxes, meaning our dbt models.
So, adapting the lexicon from software engineering, we have:
Black Box: the dbt model that plays the role of transformation
Input(s): the different tables that are used in the query. In dbt we call these sources or references
Output: the table formed after the dbt model has transformed the data
dbt is, literally, a data build tool. We use it, thanks to its simplicity, to perform transformations in our Snowflake data cloud for analytical purposes. Not to overstate the matter, but we love it.
Building a Mature Analytics Workflow
dbt already offers dbt tests, and running them is a great way to test the data in your tables. By default, the available tests are unique, not_null, accepted_values, and relationships. We can even create custom tests, and there are a variety of extensions out there that stretch dbt functionality with additional tests, such as dbt-expectations (inspired by Great Expectations) and dbt-utils. These kinds of tests examine the values in your tables, and they are a great way to identify critical data quality issues. In other words, dbt tests look at the output. However, what we want to do is test the black box itself: the transformation.
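As a minimal sketch of how such tests are declared (the model, column, and values here are illustrative, not from our actual project), a schema.yml might look like:

```yaml
version: 2

models:
  - name: fct_videos_rev            # illustrative model name
    columns:
      - name: video_id
        tests:
          - not_null                # every row must reference a video
          - relationships:          # and that video must exist upstream
              to: ref('dim_videos') # hypothetical dimension table
              field: video_id
      - name: category
        tests:
          - accepted_values:
              values: ['advertisement', 'subscription', 'other']
```

All of these assert properties of the values stored in the table; none of them says anything about whether the transformation that produced those values is correct.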
More often than not, the tables we need to build models upon are huge, and accessing billions of rows and performing transformations on them takes a long time. A 30-minute transformation might be acceptable as part of a production pipeline, but having to wait half an hour to develop and test the correctness of your transformation is, well, less than ideal.
Of course you are eventually going to run it against the real table, but minimizing the number of runs makes everyone happy. It also limits your Snowflake warehouse usage, which saves costs and makes the accountants happy as well.
Another problem we often face is having a dbt model that works for all intents and purposes for multiple months, only to later discover that there are cases which we didn’t think of. Unsurprisingly, having billions of rows of data means that all the possible scenarios are not at all easy to cover. If only there was a way to test for those cases, as well. The solution we at Orfium use is to generate mock data. They may not be real, but they work well enough to cover our edge cases and future-proof our dbt instances.
Writing tests for the sake of writing them is worse than not writing them at all. There, we said it.
Let’s face it: how many times have we introduced tests on a piece of software, gotten excited, and, thanks to the quick TDD feedback loop, beamed with self-confidence? Before you know it, we’re writing tests that have no value at all and celebrating a fantastic metric called coverage. Coverage is important, but not as a single metric. It is only a first indication and should not be a goal in itself. Good tests are the ones that provide value. Bad tests, on the other hand, only add to the debt and the maintenance burden. Remember, tests are a means to an end. To what end? Writing robust code.
How many times have we found ourselves sitting in a room with a stakeholder who tells us about a new report they need? We start asking questions, and after some back and forth, sooner or later we reach the final requirements of the report. Happy enough after the meeting, we go to our favorite warehouse, only to discover some flaw in the original request that we didn’t think of during requirements gathering. Working in an agile environment, that’s no issue: we schedule a follow-up meeting, reach a consensus on the edge cases, and the final delivery is made. However, wouldn’t it be better if actual cases could be drafted in that first meeting? Business and engineering minds often don’t mesh well, so we can use all the help we can get.
Establishing concrete scenarios of what a table could look like and what the result should be helps a lot in the process of gathering requirements.
Consider the following imaginary scenario:
Stakeholder:
For our table that contains the daily revenue for all the videos, I would like a monthly revenue summary per video for the advertisement category.
Engineer (gotcha):
select
    video_id,
    year(date_rev) as year,
    month(date_rev) as month,
    sum(revenue) revenue
from
    fct_videos_rev
where
    category = 'advertisement'
group by
    video_id,
    year(date_rev),
    month(date_rev)
Stakeholder
I would also like to see how many records the summation was comprised of.
Engineer (gotcha):
select
    video_id,
    year(date_rev) as year,
    month(date_rev) as month,
    sum(revenue) revenue,
    count(*) counts
from
    fct_videos_rev
where
    category = 'advertisement'
group by
    video_id,
    year(date_rev),
    month(date_rev)
Stakeholder
That can’t be right. Why so many counts?
Engineer
There are many rows with zero revenues, I see. You don’t want them to count towards your total count, is that right?
Stakeholder
Yes.
Engineer (gotcha):
select
    video_id,
    year(date_rev) as year,
    month(date_rev) as month,
    sum(revenue) revenue,
    count(*) counts
from
    fct_videos_rev
where
    category = 'advertisement'
    and revenue > 0
group by
    video_id,
    year(date_rev),
    month(date_rev)
Of course, this is an exaggerated example. However, imagine if the same dialog went a different way.
Stakeholder:
For our table that contains the daily revenue for all the videos, I would like a monthly revenue summary per video for the advertisement category.
Engineer:
If the table has the form:
| video_id | date_rev   | category      | revenue |
|----------|------------|---------------|---------|
| video_a  | 2022-02-12 | advertisement | 10      |
| video_a  | 2022-02-12 | advertisement | 0       |
| video_a  | 2022-03-12 | subscription  | 15      |
| video_a  | 2022-03-12 | advertisement | 1       |
Is the result you want like the following?
| video_id | year | month | revenue |
|----------|------|-------|---------|
| video_a  | 2022 | 02    | 10      |
| video_a  | 2022 | 03    | 1       |
Stakeholder
I would also like to see how many records the summation was comprised of.
Engineer:
So the result you want looks like:
| video_id | year | month | revenue | counts |
|----------|------|-------|---------|--------|
| video_a  | 2022 | 02    | 10      | 2      |
| video_a  | 2022 | 03    | 1       | 1      |
Stakeholder
Why does the first row have 2 counts?
Engineer
I see, one of those rows has zero revenue. You don’t want zero-revenue rows to count towards your total count, is that right?
Stakeholder
Yes.
Engineer (gotcha):
| video_id | year | month | revenue | counts |
|----------|------|-------|---------|--------|
| video_a  | 2022 | 02    | 10      | 1      |
| video_a  | 2022 | 03    | 1       | 1      |
And all that without having to write a single line of code. Not that an engineer is afraid to write SQL queries. But really, a lot of time is lost in translating business requirements into SQL queries. They are never that simple, and they are almost never correct on the first try either.
Orfium is a company which, at the time of writing this post, consists of more than 150 engineers. Only 6 of those are data engineers. That might sound strange, given that we are a data-heavy company dealing with billions of rows of data every month. So a new initiative has emerged, called data mesh, which we practice daily and are super proud of. One consequence of data mesh is that multiple teams handle their own instance of dbt. But this will be discussed in detail in another post. Stay tuned!
For the most part, software engineers are not familiar with writing complex SQL queries. That’s not their fault: with the variety of ORM tools available, they rarely need to. However, something software engineers do know how to do very well is write tests.
To bridge that gap, practicing test-driven development when writing SQL can help a lot of engineers get on board.
We designed a way to test dbt models (the black box). Our main drivers are:
We start by introducing the following macros:
{%- macro ref_t(table_name) -%}
    {%- if var('model_name','') == this.table -%}
        {%- if var('test_mode',false) -%}
            {%- if var('test_id','not_provided') == 'not_provided' -%}
                {%- do exceptions.warn("WARNING: test_mode is true but test_id is not provided, rolling back to normal behavior") -%}
                {{ ref(table_name) }}
            {%- else -%}
                {%- do log("stub ON, replace table: [" + table_name + "] --> [" + this.table + "_MOCK_" + table_name + "_" + var('test_id') + "]", info=True) -%}
                {{ ref(this.table + '_MOCK_' + table_name + '_' + var('test_id')) }}
            {%- endif -%}
        {%- else -%}
            {{ ref(table_name) }}
        {%- endif -%}
    {%- else -%}
        {{ ref(table_name) }}
    {%- endif -%}
{%- endmacro -%}

{%- macro source_t(schema, table_name) -%}
    {%- if var('model_name','') == this.table -%}
        {%- if var('test_mode',false) -%}
            {%- if var('test_id','not_provided') == 'not_provided' -%}
                {%- do exceptions.warn("WARNING: test_mode is true but test_id is not provided, rolling back to normal behavior") -%}
                {{ builtins.source(schema, table_name) }}
            {%- else -%}
                {%- do log("stub ON, replace table: [" + schema + "." + table_name + "] --> [" + this.table + "_MOCK_" + table_name + "_" + var('test_id') + "]", info=True) -%}
                {{ ref(this.table + '_MOCK_' + table_name + '_' + var('test_id')) }}
            {%- endif -%}
        {%- else -%}
            {{ builtins.source(schema, table_name) }}
        {%- endif -%}
    {%- else -%}
        {{ builtins.source(schema, table_name) }}
    {%- endif -%}
{%- endmacro -%}
These macros optionally override the behavior of the built-in ref and source macros: during a normal run they fall through to the originals, while in test mode, for the model under test, they swap each input for its corresponding mock seed.
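As a quick illustration of the usage (the table name FCT_VIDEOS_REV here is made up, not from our project), a model only has to call ref_t where it would normally call ref:

```sql
-- In a normal run this compiles to ref('FCT_VIDEOS_REV').
-- In test mode, when this model is the one named by the model_name var,
-- it compiles to ref('<THIS_MODEL>_MOCK_FCT_VIDEOS_REV_<TEST_ID>') instead,
-- so the query reads from the mock seed rather than the real table.
select *
from {{ ref_t('FCT_VIDEOS_REV') }}
```

Because the fallback branch is the plain ref/source call, production runs are completely unaffected by the extra indirection.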
Prefer multiple small test cases over a few large ones
Test cases should test something specific. Generating mock data with hundreds of records that test multiple business rules should be avoided. Should a test case fail, it should be easy to identify the cause and its impact.
Suppose we would like to create a test_id named MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE for our model VIDEOS_INFO_SUMMARY, which uses a source VIDEOS_INFO.
We create a new folder, MOCK_VIDEOS_INFO_SUMMARY, under seeds.
The mock input seed (VIDEOS_INFO_SUMMARY_MOCK_VIDEOS_INFO_MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE.csv):

VIDEO_ID,DATE_REV,CATEGORY,REVENUE
video_a,2022-02-12,advertisement,10
video_a,2022-02-12,advertisement,0
video_a,2022-03-12,other,15
video_a,2022-03-12,advertisement,1
The expected results seed (VIDEOS_INFO_SUMMARY_MOCK_RESULTS_MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE.csv):

VIDEO_ID,YEAR,MONTH,REVENUE,COUNTS
video_a,2022,2,10,1
video_a,2022,3,1,1
version: 2

seeds:
  - name: VIDEOS_INFO_SUMMARY_MOCK_RESULTS_MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE
    config:
      enabled: "{{ var('test_mode', false) }}"

  - name: VIDEOS_INFO_SUMMARY_MOCK_VIDEOS_INFO_MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE
    config:
      enabled: "{{ var('test_mode', false) }}"
Notice that the seeds are enabled only in test_mode, so they are not created during normal runs.
models:
  - name: VIDEOS_INFO_SUMMARY
    description: "Summary of VIDEOS_INFO"
    tests:
      - dbt_utils.equality:
          tags: ['test_VIDEOS_INFO_SUMMARY_MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE']
          compare_model: ref('VIDEOS_INFO_SUMMARY_MOCK_RESULTS_MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE')
          compare_columns:
            - VIDEO_ID
            - YEAR
            - MONTH
            - REVENUE
            - COUNTS
          enabled: "{{ var('test_mode', false) }}"
{{
    config(
        materialized = 'table'
    )
}}

SELECT
    VIDEO_ID,
    YEAR(DATE_REV) AS YEAR,
    MONTH(DATE_REV) AS MONTH,
    SUM(REVENUE) REVENUE,
    COUNT(*) COUNTS
FROM
    {{ source_t('MY_SCHEMA','VIDEOS_INFO') }}
WHERE
    CATEGORY = 'advertisement'
    AND REVENUE > 0
GROUP BY
    VIDEO_ID,
    YEAR(DATE_REV),
    MONTH(DATE_REV)
Notice the source_t usage instead of using the default source macro.
Now, to run a test case, we go through the following three steps.
dbt seed --full-refresh -m MOCK_VIDEOS_INFO_SUMMARY --vars '{"test_mode":true}'

dbt run -m VIDEOS_INFO_SUMMARY --vars '{"test_mode":true,"test_id":"MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE","model_name":"VIDEOS_INFO_SUMMARY"}'

dbt test --select tag:test_VIDEOS_INFO_SUMMARY_MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE --vars '{"test_mode":true,"test_id":"MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE","model_name":"VIDEOS_INFO_SUMMARY"}'
Note: Because the whole process is a bit tedious with writing all those big commands, we wrote a bash script which automates all three steps:
The requirement is to create a file conf_test/tests_definitions.csv which has the format:
# MODEL_NAME,TEST_ID
VIDEOS_INFO_SUMMARY,MULTIPLE_VIDEOS_HAVE_ZERO_REVENUE
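A minimal sketch of what such an automation script could look like (this is a hypothetical reconstruction, not the actual script from the repo; the file layout and function name are assumptions):

```shell
#!/usr/bin/env bash
# Sketch: for each MODEL_NAME,TEST_ID pair in conf_test/tests_definitions.csv,
# print the three dbt commands (seed, run, test) for that test case.
set -euo pipefail

# gen_commands prints the three dbt commands for one MODEL_NAME,TEST_ID pair.
gen_commands() {
  local model="$1" test_id="$2"
  local vars="{\"test_mode\":true,\"test_id\":\"${test_id}\",\"model_name\":\"${model}\"}"
  echo "dbt seed --full-refresh -m MOCK_${model} --vars '{\"test_mode\":true}'"
  echo "dbt run -m ${model} --vars '${vars}'"
  echo "dbt test --select tag:test_${model}_${test_id} --vars '${vars}'"
}

# Read every non-comment, non-empty line of the definitions file and print
# the commands for each test case (pipe the output to bash to execute them).
if [[ -f conf_test/tests_definitions.csv ]]; then
  while IFS=, read -r model test_id; do
    case "${model}" in ''|\#*) continue ;; esac
    gen_commands "${model}" "${test_id}"
  done < conf_test/tests_definitions.csv
fi
```

Printing the commands rather than running them directly makes the script easy to dry-run; executing them is then just a matter of piping its output to bash.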
In the whole setup described above, there are some conventions that must be followed, otherwise the script and macros might not work.
All the code exists in the following repo: https://github.com/vasilisgav/dbt_tdd_example
What we found by working with this approach is what you would expect from any TDD approach: a big win in how we release our dbt models.
Pros
Cons:
So, what are the key takeaways? Testing is important, but good, smart testing can truly free an organization from a lot of daily tedium and allow it, as it has us, to focus on serving the business efficiently and with the least amount of friction.