Commit af450e9

Merge pull request #885 from felicialim/object_audio_element
Add new audio element type: OBJECT_BASED
2 parents 754ac0c + 23734f6 commit af450e9

1 file changed

Lines changed: 36 additions & 7 deletions

index.bs

@@ -697,6 +697,9 @@ class AudioElementOBU() {
 else if (audio_element_type == SCENE_BASED) {
 AmbisonicsConfig ambisonics_config;
 }
+else if (audio_element_type == OBJECT_BASED) {
+ObjectsConfig objects_config;
+}
 else {
 leb128() audio_element_config_size;
 unsigned int (8 x audio_element_config_size) audio_element_config_bytes;
@@ -730,10 +733,11 @@ class ReconGainParamDefinition() extends ParamDefinition() {
 <dfn noexport>audio_element_type</dfn> specifies the audio representation of this [=Audio Element=], which is constructed from one or more [=Audio Substream=]s. Parsers SHOULD ignore [=Audio Element OBU=]s with an [=audio_element_type=] that they do not recognize.

 <pre class = "def">
-audio_element_type: The type of audio representation.
-0 : CHANNEL_BASED
-1 : SCENE_BASED
-2~7 : Reserved for future use
+audio_element_type : The type of audio representation.
+0 : CHANNEL_BASED
+1 : SCENE_BASED
+2 : OBJECT_BASED
+3~7 : Reserved for future use
 </pre>

 <dfn noexport for="audio_element_obu">codec_config_id</dfn> indicates the identifier for the codec configuration which this [=Audio Element=] refers to. Parsers SHOULD ignore [=Audio Element OBU=]s with a [=audio_element_obu/codec_config_id=] identifying a [=codec_id=] that they don't support.
@@ -744,7 +748,7 @@ audio_element_type: The type of audio representation.

 <dfn noexport>num_parameters</dfn> specifies the number of [=Parameter Substream=]s that are used by the algorithms specified in this [=Audio Element=].
 - When [=audio_element_type=] = 0, this field SHALL be set to 0, 1, or 2.
-- When [=audio_element_type=] = 1, this field SHALL be set to 0.
+- When [=audio_element_type=] = 1 or 2, this field SHALL be set to 0.
 - Parsers SHALL support any value of [=num_parameters=].

 NOTE: For a given [=audio_element_type=], a future version of the specification may define a new [=Parameter Substream=] which may be ignored by IA decoders compliant with this version of the specification. In that case, a new [=param_definition_type=] will be defined in a future version of [=Audio Element OBU=].
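
Reviewer sketch (not part of the commit): the updated `num_parameters` rules can be expressed as a small writer-side check. This is a minimal illustration in Python, assuming the type values from the table in this diff; `AudioElementType`, `ALLOWED_NUM_PARAMETERS`, and `writer_num_parameters_ok` are hypothetical names, not identifiers from the specification. Because parsers SHALL support any value of `num_parameters`, the check is meant for encoders or conformance tooling, not for rejecting input while parsing.

```python
from enum import IntEnum


class AudioElementType(IntEnum):
    """Type values from the audio_element_type table (3~7 stay reserved)."""
    CHANNEL_BASED = 0
    SCENE_BASED = 1
    OBJECT_BASED = 2  # value introduced by this change


# num_parameters values a writer may emit for each known type.
ALLOWED_NUM_PARAMETERS = {
    AudioElementType.CHANNEL_BASED: {0, 1, 2},
    AudioElementType.SCENE_BASED: {0},
    AudioElementType.OBJECT_BASED: {0},
}


def writer_num_parameters_ok(audio_element_type: int, num_parameters: int) -> bool:
    """Return True if num_parameters satisfies the writer-side SHALL rules.

    Reserved types have no rule in this version of the text, so this
    sketch does not flag them (an assumption, not a spec requirement).
    """
    try:
        known = AudioElementType(audio_element_type)
    except ValueError:
        return True
    return num_parameters in ALLOWED_NUM_PARAMETERS[known]
```

For example, `writer_num_parameters_ok(2, 1)` is false under the new rule, since an object-based element must carry zero parameter substreams.
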
@@ -799,16 +803,16 @@ In this parameter definition,

 <dfn noexport>param_definition_bytes</dfn> represents reserved bytes for future use when new [=param_definition_type=] values are defined. Parsers SHOULD ignore these bytes when they don't understand the parameter definition.

-
 <dfn noexport>scalable_channel_layout_config</dfn> is an instance of the [=ScalableChannelLayoutConfig()=] class, which provides the metadata required for combining the [=Audio Substream=]s referred to here in order to reconstruct a scalable channel layout.

 <dfn noexport>ambisonics_config</dfn> is an instance of the [=AmbisonicsConfig()=] class, which provides the metadata required for combining the [=Audio Substream=]s referred to here in order to reconstruct an Ambisonics layout.

+<dfn noexport>objects_config</dfn> is an instance of the [=ObjectsConfig()=] class, which provides the metadata required to reconstruct one or more objects from the referenced [=Audio Substream=].
+
 <dfn noexport>audio_element_config_size</dfn> indicates the size in bytes of [=audio_element_config_bytes=].

 <dfn noexport>audio_element_config_bytes</dfn> represents reserved bytes for future use when new [=audio_element_type=] values are defined. Parsers SHOULD ignore these bytes when they don't recognize a particular configuration.

-
 <dfn noexport>default_demixing_info_parameter_data</dfn> is an instance of the [=DefaultDemixingInfoParameterData()=] class, which provides the default demixing parameter data to apply to all audio samples when there are no [=Parameter Block OBU=]s (with the same [=ParamDefinition/parameter_id=] defined in this [=DemixingParamDefinition()=]) provided.
 - In this class, [=w_idx_offset=] in [=demixing_info_parameter_data=] SHALL be ignored.
 - Instead, [=default_w=] directly indicates the weight value [=w(k)|\(w(k)\)=].
@@ -1262,6 +1266,31 @@ If [=ambisonics_mode=] is equal to PROJECTION, this indicates that the Ambisonic

 A scene-based [=Audio Element=] has only one [=Channel Group=], which includes all [=Audio Substream=]s that it refers to. The order of the [=Audio Substream=]s in the [=Channel Group=] SHALL conform to [[RFC-8486]].

+### Objects Config Syntax and Semantics ### {#syntax-objects-config}
+
+The <dfn noexport>ObjectsConfig()</dfn> class provides the configuration for an object-based [=Audio Element=]. This section specifies the syntax structure of the [=ObjectsConfig()=] class.
+
+<b>Syntax</b>
+
+```
+class ObjectsConfig() {
+leb128() objects_config_size;
+unsigned int (8) num_objects;
+unsigned int (8 x (objects_config_size - 1)) objects_config_extension_bytes;
+}
+```
+
+<b>Semantics</b>
+
+<dfn noexport>objects_config_size</dfn> indicates the size in bytes of the syntaxes immediately following this field up to and including [=objects_config_extension_bytes=]. Parsers SHOULD ignore bytes past the [=ObjectsConfig()=] syntax that they recognize.
+
+<dfn noexport>num_objects</dfn> specifies the number of objects that the referenced [=Audio Substream=] provides the audio for. It SHALL NOT be set to 0.
+
+- When [=num_objects=] = 1, the [=Audio Substream=] has one channel of audio, which SHALL be coded in mono mode.
+
+- When [=num_objects=] = 2, the [=Audio Substream=] has two channels of audio, one for each object, which SHALL be coded in stereo mode.
+
+<dfn noexport>objects_config_extension_bytes</dfn> represents reserved bytes for future use. Parsers that don't understand these bytes SHOULD ignore them.

 ## Mix Presentation OBU Syntax and Semantics ## {#obu-mixpresentation}

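Reviewer sketch (not part of the commit): the `ObjectsConfig()` syntax added above can be read with a short parser. This is a minimal illustration in Python, assuming `leb128()` is an unsigned little-endian base-128 integer; `read_leb128`, `parse_objects_config`, and the `ObjectsConfig` dataclass are hypothetical names. The sketch places no cap on the encoded length of the leb128 value and does not validate `num_objects` beyond the non-zero rule, and rejecting zero is a design choice for this example.

```python
from dataclasses import dataclass


def read_leb128(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode an unsigned LEB128 value at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]  # raises IndexError if the input ends early
        pos += 1
        value |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            return value, pos
        shift += 7


@dataclass
class ObjectsConfig:
    num_objects: int
    extension_bytes: bytes  # reserved bytes, kept but not interpreted


def parse_objects_config(buf: bytes, pos: int = 0) -> tuple[ObjectsConfig, int]:
    """Parse one ObjectsConfig() at pos; return (config, next_pos)."""
    size, pos = read_leb128(buf, pos)
    if size < 1:
        raise ValueError("objects_config_size must cover num_objects")
    end = pos + size
    if end > len(buf):
        raise ValueError("truncated ObjectsConfig")
    num_objects = buf[pos]
    if num_objects == 0:
        raise ValueError("num_objects SHALL NOT be 0")
    # Everything after num_objects, up to the declared size, is extension data.
    extension = buf[pos + 1:end]
    return ObjectsConfig(num_objects, extension), end
```

Because the returned position is derived from `objects_config_size`, a caller resumes parsing at the right offset even when the extension bytes carry fields defined by a later version of the specification.
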