The creation of 3D content is becoming increasingly important, and the challenge is to exchange complex scenes smoothly between different platforms. Traditional, bundled file formats often struggle with compatibility, bloated workflows and performance issues. For this reason, a modular approach with JSON-linked 3D assets is gaining importance as a future-proof solution. As part of the European XReco project, Mediapro/Visyon has developed a new JSON-based standard for packaging and distributing 3D, video and audio content, together with a browser-based editor that lets anyone create interactive 3D stories, no coding required.

Smart 3D Asset Exchange for Web and Cloud — Powered by JSON

The widespread growth of 3D content creation and distribution—spanning fields from architecture to e-commerce—has highlighted a need for more flexible, standardized methods of exchanging complex scenes.

A model that displays smoothly in a desktop application may struggle with compatibility or performance issues in a web viewer, especially if all assets are bundled into a single file. Complex metadata, hierarchies, animations, and transformations further complicate synchronization across systems. Without a lightweight, reference-based approach, workflows become inefficient, prone to data redundancy, multiple conversions, and potential quality loss, ultimately increasing production time and costs while hindering more scalable, dynamic 3D ecosystems.

A JSON-based format that links to external GLB/GLTF resources offers a lighter, more modular approach, separating spatial information from visual content. This streamlined method simplifies interoperability, supports incremental updates, and fosters a scalable standard that can seamlessly integrate with diverse platforms, tools, and workflows. 

At Mediapro/Visyon, we have spearheaded the adoption of JSON as the common asset-exchange format within the European XReco project, which is now scheduled to conclude in October 2025. Our contribution includes a clearly defined JSON schema capable of packaging 3D models, video and audio together, plus a browser-based visual editor that lets non-technical users drag and drop those assets and replay them instantly in compile-free web and mobile applications. This article describes how the approach was developed.

Figure 1: symbolic image by MediaPro/Visyon

Current 3D content exchange situation

USD (Universal Scene Description), originally developed by Pixar and now advanced through the Alliance for OpenUSD (founded by Pixar, Adobe, Apple, Autodesk and NVIDIA), has become a strong industry standard for managing and exchanging complex 3D scenes. While it offers extensive functionality and is widely adopted in film, animation, and advanced visualization, USD generally keeps references within its own technical ecosystem. Although it can link to external resources, these must still fit USD’s own conventions. This approach limits versatility, making it harder to integrate lightweight, web-friendly formats like GLB/GLTF, which are often better suited for immediate, dynamic content delivery. As a result, while USD is a significant step forward, its inward focus highlights the need for a complementary, more neutral and lightweight format that can handle external references more freely.

Self-contained ecosystem: USD focuses on its own environment, making it harder to integrate formats like GLB/GLTF directly.

Heavier files: Scenes can become large and complex, slowing down transfers and load times.

Specialized tools: USD often depends on specific software and libraries, reducing flexibility.

FBX, a widely adopted format in gaming, animation, and VFX, offers robust functionality but is limited by its proprietary nature and binary structure. While it integrates well with desktop tools, it’s less suited for newer, web-friendly standards like GLB/GLTF, adding unnecessary complexity and data overhead. As a result, its closed ecosystem and heavier format highlight the need for more open, lightweight solutions.

Proprietary ecosystem: FBX is a proprietary Autodesk format, which limits seamless integration with diverse tools and formats.

Binary complexity: Its internal structure is less accessible, complicating parsing and conversion to web-friendly formats.

Reduced web adaptability: FBX is not optimized for fast, on-demand content delivery across browsers and mobile devices.

JSON: one ring to rule them all, one ring to find them, one ring to bring them all

In an increasingly interconnected and platform-agnostic landscape, a JSON-based approach to referencing external GLB/GLTF assets emerges as a versatile, future-proof solution that harmonizes the once fragmented world of 3D content exchange. By allowing scene structure to remain independent from the actual models, teams can maintain cleaner pipelines where updates, version control, and integration with various workflows become seamless. Adopting an established standard like GLB/GLTF ensures broad compatibility and reduces the friction inherent in using multiple proprietary or legacy formats. Meanwhile, the innate scalability and modularity of referencing assets external to the scene file encourages iterative improvement and easy customization, whether for rapid prototyping in a design studio or dynamic asset management in a virtual showroom. Web compatibility, metadata support, and the efficiency of loading only what is needed on demand further expand the range of scenarios—from AR/VR experiences and online configurators to interactive training modules—where this format can excel, enabling creators, developers, and end-users to reap the benefits of a flexible, accessible, and evolution-ready 3D ecosystem.

Established Standard (GLB/GLTF):

By relying on GLB/GLTF as the external asset format, the approach leverages a well-established industry standard recognized for its efficient, compact, and interoperable representation of 3D models. GLB/GLTF is widely supported across various platforms, tools, and engines, ensuring that referencing these external assets maintains high compatibility, simplifies integration, and reduces the need for custom conversion or adaptation processes.

Scalability and Modularity:

The format allows adding, removing, or replacing references to external assets without disrupting the overall scene structure. This modularity simplifies maintenance, encourages iterative development, and ensures a flexible approach to content management.

The format only defines the scene’s structure—positions, hierarchies, and transformations—while leaving actual models out. These 3D assets are referenced as external GLB/GLTF files, allowing updates or replacements without altering the primary scene file.
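As a rough illustration of this separation, the sketch below parses a small scene descriptor that carries only hierarchy and transforms and points to external GLB files by URI. The field names (`nodes`, `asset`, `position`, `children`) and the example URLs are assumptions made for this example; they are not taken from the XReco schema.

```python
import json
from dataclasses import dataclass, field

# Hypothetical descriptor: field names and URIs are illustrative only.
SCENE_JSON = """
{
  "name": "showroom",
  "nodes": [
    {
      "id": "table",
      "asset": "https://example.com/assets/table.glb",
      "position": [0.0, 0.0, 0.0],
      "children": [
        {
          "id": "lamp",
          "asset": "https://example.com/assets/lamp.glb",
          "position": [0.2, 0.75, 0.0]
        }
      ]
    }
  ]
}
"""

@dataclass
class Node:
    id: str
    asset: str          # URI of an external GLB/GLTF file, never embedded data
    position: tuple     # local translation relative to the parent node
    children: list = field(default_factory=list)

def parse_node(raw: dict) -> Node:
    """Build a Node tree from the raw JSON dictionary."""
    return Node(
        id=raw["id"],
        asset=raw["asset"],
        position=tuple(raw.get("position", (0.0, 0.0, 0.0))),
        children=[parse_node(c) for c in raw.get("children", [])],
    )

def world_positions(node: Node, parent=(0.0, 0.0, 0.0)) -> dict:
    """Accumulate translations down the hierarchy (rotation and scale omitted)."""
    world = tuple(p + q for p, q in zip(parent, node.position))
    result = {node.id: world}
    for child in node.children:
        result.update(world_positions(child, world))
    return result

scene = json.loads(SCENE_JSON)
roots = [parse_node(n) for n in scene["nodes"]]
positions = world_positions(roots[0])
```

In a descriptor shaped like this, replacing the lamp model would mean changing a single `asset` string, while the hierarchy and placement stay untouched.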

Figure 2: symbolic image by MediaPro/Visyon

Transparency and Versioning:

Each asset is explicitly referenced, enabling easy version management without duplicating data. This improves change tracking and simplifies long-term collaboration across teams.
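One way to picture this kind of change tracking is to compare the asset references of two revisions of a descriptor. The sketch below assumes each revision has been reduced to a simple map from node id to asset URI, and that versions are encoded in the file name (`@1.0`); both are conventions invented for the example.

```python
def asset_changes(old: dict, new: dict) -> dict:
    """Compare two {node_id: asset_uri} maps and report what changed."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "updated": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }

# Two hypothetical revisions of the same scene.
rev1 = {
    "table": "assets/table@1.0.glb",
    "lamp": "assets/lamp@1.0.glb",
}
rev2 = {
    "table": "assets/table@1.0.glb",
    "lamp": "assets/lamp@1.1.glb",   # new version of an existing asset
    "vase": "assets/vase@1.0.glb",   # newly referenced asset
}

changes = asset_changes(rev1, rev2)
```

Because only references are compared, no model data has to be duplicated or opened to see which assets differ between revisions.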

Web Compatibility:

Because of JSON’s simplicity and alignment with current web technologies and libraries, it naturally supports interoperability with 3D viewers, online editors, and engines based on WebGL/WebGPU. This compatibility fosters a more fluid integration into the broader ecosystem of web-based 3D applications.

Associated Metadata:

Beyond structural information, the format can reference metadata linked to each asset (tags, authorship, licensing, update dates, vertex count, face count), allowing contextual data to be preserved and accessed without affecting the visual content or the scene’s logic.

Easy Loading of Multiple Integral Assets:

By distributing the scene’s content into separate external GLB/GLTF files, each asset can be requested and loaded independently. This approach streamlines the loading process, allows for on-demand retrieval of only the necessary components, and ensures that even complex scenes remain manageable and performant across different platforms and devices.
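A minimal sketch of on-demand retrieval is shown below: each asset is fetched the first time it is requested and reused afterwards. The `fetch` callable stands in for whatever transport a real viewer would use (for example an HTTP request); the class name, URIs and the fake fetch function are all assumptions for illustration.

```python
from typing import Callable, Dict

class AssetCache:
    """Loads each referenced asset at most once, and only when requested."""

    def __init__(self, fetch: Callable[[str], bytes]):
        self._fetch = fetch                  # transport is injected by the caller
        self._cache: Dict[str, bytes] = {}
        self.fetch_count = 0                 # number of real fetches performed

    def get(self, uri: str) -> bytes:
        if uri not in self._cache:
            self._cache[uri] = self._fetch(uri)
            self.fetch_count += 1
        return self._cache[uri]

def fake_fetch(uri: str) -> bytes:
    # Stand-in for a network request, so the example runs offline.
    return f"binary data for {uri}".encode()

cache = AssetCache(fake_fetch)
first = cache.get("assets/statue.glb")
second = cache.get("assets/statue.glb")   # served from the cache
other = cache.get("assets/vase.glb")
```

Assets that are never requested are never fetched, which is the property the paragraph above relies on for large scenes.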

As the 3D content ecosystem continues to evolve and expand across industries, the need for flexible, efficient asset management becomes increasingly critical. A JSON-based reference format for 3D assets represents more than just a technical solution – it embodies a fundamental shift in how we approach content delivery and scene management. By separating scene structure from asset data, leveraging the ubiquity of GLB/GLTF, and embracing a modular architecture, this approach addresses current challenges while remaining adaptable to future innovations.

Use Cases

Architecture, Engineering, and Construction (AEC) – BIM and IFC/GLB Transfer:

In architecture, industrial design, and engineering, BIM (Building Information Modeling) models serve as rich data sources describing buildings, infrastructure, and constructed environments. While formats like IFC (Industry Foundation Classes) standardize construction-related information, they are not always ideal for lightweight 3D visualization or integration into web-based or real-time rendering environments. By adopting a JSON-based approach referencing GLB/GLTF assets, IFC geometries can be efficiently translated into GLB while preserving the essential scene structure—object hierarchies, transformations, and spatial layouts—without carrying the full load of BIM details. For instance, a user could explore an architectural project through an interactive web viewer, accessing only the necessary elements on demand, all while keeping the BIM data synchronized and up to date. This approach streamlines transfers between IFC and GLB, providing a seamless experience that combines the precision of BIM modeling with the efficiency and flexibility of lightweight 3D visualization.

Games and Virtual Environments:

In video games and interactive applications, level layouts, object placement, and spatial logic can reside in a JSON “map” of the scene, while characters, props, textures, and other 3D assets are loaded as external GLB/GLTF files. This modular approach makes it easy to update or replace elements without modifying the overall game structure, simplifying maintenance, iteration, and scalability. On-demand asset streaming reduces initial load times, optimizing user experience and allowing complex levels to be gradually rendered according to context and performance requirements.
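To make the streaming idea concrete, the sketch below selects which assets of a level should be loaded based on their distance from the viewer. The level layout, the radius and the nearest-first ordering are assumptions chosen for the example; a real engine would likely use more elaborate criteria.

```python
import math

def assets_in_range(placements: dict, viewer: tuple, radius: float) -> list:
    """Return ids of placed objects within `radius` of the viewer, nearest first."""
    distances = {
        obj_id: math.dist(viewer, placement["position"])
        for obj_id, placement in placements.items()
    }
    in_range = (obj_id for obj_id, d in distances.items() if d <= radius)
    return sorted(in_range, key=distances.get)

# Hypothetical JSON "map" of a level: placement only, models stay external.
level = {
    "door":  {"asset": "assets/door.glb",  "position": (2.0, 0.0, 0.0)},
    "crate": {"asset": "assets/crate.glb", "position": (6.0, 0.0, 8.0)},
    "tower": {"asset": "assets/tower.glb", "position": (90.0, 0.0, 40.0)},
}

to_load = assets_in_range(level, viewer=(0.0, 0.0, 0.0), radius=25.0)
```

Since positions live in the scene map, this decision can be made before any model file has been downloaded.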

Museums and Object Libraries:

In educational, cultural, or commercial contexts, virtual museums and 3D object libraries must often present extensive collections. A JSON format referencing GLB/GLTF allows users to load only those objects they wish to explore, minimizing initial load times and improving navigation. For example, a virtual visitor can browse through a gallery of sculptures or industrial prototypes, and when stopping at a particular piece, the specific object and its associated metadata (creator, creation date, licensing, dimensions, materials) are loaded on demand. This approach preserves performance, prevents the need to reload entire scenes, and supports scalable, flexible integration with web platforms, mobile devices, and VR viewers, enhancing both interactivity and efficiency.

Beyond Traditional Geometry

Not all scene elements need to be standard polygonal meshes. By referencing advanced data types—such as Neural Radiance Fields (NeRFs), Gaussian Splatting representations, or any forthcoming 3D data formats—this JSON-based model accommodates emerging, non-standardized visualization methods as they evolve. If a viewer or platform does not currently support a particular asset type, it can gracefully default to displaying a simple bounding box, ensuring that the scene remains understandable and navigable. Over time, as new formats become industry-ready, developers can seamlessly integrate them into existing scenes without reengineering the entire structure. This forward-thinking approach embraces experimentation and continuous innovation, maintaining broad compatibility today and ensuring a smooth path for tomorrow’s breakthroughs in 3D rendering and visualization.
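The bounding-box fallback described above could look roughly like the following. The sketch assumes that every node declares a `type` and axis-aligned `bounds` alongside its URI; these field names, the type strings and the file extensions are hypothetical.

```python
SUPPORTED_TYPES = {"glb", "gltf"}   # what this hypothetical viewer can render

def resolve_representation(node: dict, supported=SUPPORTED_TYPES) -> dict:
    """Return the asset to draw, or a bounding-box placeholder if unsupported."""
    if node["type"] in supported:
        return {"draw": "asset", "uri": node["uri"]}
    # Unknown type (for example a NeRF or splat file): use the declared bounds.
    bounds = node["bounds"]
    return {"draw": "box", "min": bounds["min"], "max": bounds["max"]}

nodes = [
    {"type": "glb", "uri": "assets/chair.glb",
     "bounds": {"min": [0, 0, 0], "max": [1, 1, 1]}},
    {"type": "gaussian_splat", "uri": "assets/garden.splat",
     "bounds": {"min": [-5, 0, -5], "max": [5, 3, 5]}},
]

plan = [resolve_representation(n) for n in nodes]
```

A viewer that later gains splat support would only need to add the type to its supported set; the scene file itself would not change.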

From architectural visualization to game development, from virtual museums to industrial applications, the format’s combination of simplicity, scalability, and extensibility provides a robust foundation for the next generation of 3D applications. As emerging technologies like Neural Radiance Fields and Gaussian Splatting continue to advance, this reference format’s ability to accommodate new asset types while maintaining backward compatibility ensures its relevance in an ever-changing technological landscape. The future of 3D content delivery lies not in monolithic file formats but in intelligent, adaptable reference systems that can evolve alongside the industry’s needs.

Cloud-Based Virtual Production and Descriptive Format Implementation

Distributed Rendering Architecture and Reference-Based Asset Management

Cloud-based virtual production leverages distributed rendering architectures that benefit significantly from reference-based descriptive formats. These JSON-based interchange formats implement a separation of concerns between scene hierarchy descriptors and actual 3D assets, enabling parallel processing pipelines with optimized resource allocation. By referencing external GLB/GLTF assets rather than embedding them directly, the system maintains a lightweight scene graph representation (typically <1MB) that can be rapidly transmitted across network nodes while the more substantial geometric data (often 10-500MB per asset) can be cached, versioned, and processed independently. This architecture supports multi-threaded rendering processes across heterogeneous cloud environments, achieving near-linear scaling in complex production scenarios. Technical implementations typically incorporate WebSocket-based state synchronization protocols that propagate only delta changes to the JSON descriptor, resulting in minimal bandwidth consumption during collaborative sessions while maintaining sub-millisecond latency for critical scene graph modifications.
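The delta propagation mentioned above can be sketched as follows, leaving the WebSocket transport aside. The example assumes the scene state is a flat map from node id to node state and that a delta message has the shape `{"set": ..., "delete": ...}`; both are illustrative choices, not a documented protocol.

```python
import json

def compute_delta(old: dict, new: dict) -> dict:
    """Describe how to turn `old` into `new` for a flat {node_id: state} map."""
    return {
        "set": {k: v for k, v in new.items() if k not in old or old[k] != v},
        "delete": [k for k in old if k not in new],
    }

def apply_delta(state: dict, delta: dict) -> dict:
    """Apply a delta produced by compute_delta and return the new state."""
    result = {k: v for k, v in state.items() if k not in delta["delete"]}
    result.update(delta["set"])
    return result

before = {
    "camera": {"asset": "assets/camera.glb", "position": [0, 1.6, 4]},
    "actor":  {"asset": "assets/actor.glb",  "position": [0, 0, 0]},
    "prop":   {"asset": "assets/prop.glb",   "position": [2, 0, 1]},
}
after = {
    "camera": {"asset": "assets/camera.glb", "position": [0, 1.6, 4]},
    "actor":  {"asset": "assets/actor.glb",  "position": [0.5, 0, 0]},  # moved
}

delta = compute_delta(before, after)          # what would be sent to peers
delta_bytes = len(json.dumps(delta))
full_bytes = len(json.dumps(after))
```

Only the moved actor and the removed prop appear in the message; the unchanged camera is not retransmitted, and none of the referenced GLB data is involved at all.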

API-Driven Integration with Emerging Rendering Technologies

Reference-based descriptive formats establish a technology-agnostic foundation that inherently accommodates rendering evolution through temporal decoupling of scene structure from visualization implementation. When the JSON descriptor solely references external assets by URI and transformation parameters, rendering pipeline advancements can be implemented transparently without requiring scene graph modifications. This architecture creates natural technological progression paths: as NeRF encoding algorithms advance from original density field representations to more efficient hash-encoded implementations, the same scene descriptor seamlessly adopts improved visual fidelity and performance characteristics without structural refactoring. Similarly, when Gaussian Splatting transitions from first-generation point-based primitives to advanced hierarchical structures with anisotropic representations, existing scenes automatically inherit these improvements through renderer API updates while maintaining backward compatibility with legacy visualization systems. The format’s extensible type system allows progressive enhancement through optional rendering directives (e.g., {"renderingHint": "prefer_gaussian_v2", "fallback": "gaussian_v1"}) that gracefully degrade when newer technologies are unavailable. This implementation approach creates a technical foundation where content creation remains insulated from rendering technology lifecycles. A production pipeline using this JSON-based referential model effectively future-proofs creative assets, allowing scenes authored today to automatically leverage tomorrow’s visualization breakthroughs without requiring resource-intensive scene reconstruction or asset conversion.
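How a client might interpret such a rendering directive is sketched below. The keys `renderingHint` and `fallback` come from the example directive in the text; treating the `prefer_` prefix as a soft preference and defaulting to a bounding box are assumptions made here to keep the example self-contained.

```python
def pick_renderer(directive: dict, available: set, default: str = "bounding_box") -> str:
    """Choose the first renderer named in the directive that this client supports."""
    for key in ("renderingHint", "fallback"):
        wanted = directive.get(key, "")
        # Assumed convention: a "prefer_" prefix marks a soft preference.
        name = wanted[len("prefer_"):] if wanted.startswith("prefer_") else wanted
        if name in available:
            return name
    return default

directive = {"renderingHint": "prefer_gaussian_v2", "fallback": "gaussian_v1"}

new_client = pick_renderer(directive, {"gaussian_v1", "gaussian_v2"})
old_client = pick_renderer(directive, {"gaussian_v1"})
mesh_only = pick_renderer(directive, {"mesh"})
```

Under these assumptions the same directive yields the newer renderer on an up-to-date client, the older one on a legacy client, and a placeholder where neither is available, without any change to the scene file.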

Summary

Mediapro/Visyon has taken up the challenge of sharing 3D content across multiple platforms and developed a JSON-based format as part of the European XReco project. The solution separates the scene structure from the 3D content and references external GLB/GLTF files to enable lightweight, modular and web-friendly workflows. In combination with the browser-based editor XRCapsules, this approach simplifies content management, improves interoperability, and keeps 3D ecosystems open to new technologies, while remaining accessible to non-technical users.

About Visyon

Visyon, part of the Mediapro Group, contributes to XReco by bringing its expertise in immersive media technologies and XR content creation. Within the project, Visyon plays a key role in the validation of XR-based services and experiences, particularly focused on news broadcasting and tourism use cases.

In addition, Visyon is developing XRCapsules, a lightweight authoring solution designed to allow non-technical users to compose and interact with XR content across platforms. XRCapsules enables content creators, such as journalists, producers or museum staff, to easily generate, edit, and distribute XR experiences without requiring specialized technical skills, contributing to XReco’s goal of democratizing access to immersive media production.

By integrating its technological capabilities and creative vision, Visyon helps demonstrate how XReco’s platform and services can be applied across industries, contributing significantly to the project’s ambition of transforming the European media landscape through innovative data sharing and immersive media formats.
