RealityKit: AR and Spatial Computing Across All Apple Platforms


You want to place a 3D Pixar character on a user’s desk. SceneKit could do it five years ago, but Apple has made it clear: RealityKit is the future of 3D rendering across iOS, iPadOS, macOS, and visionOS. It shares the same entity-component system on every platform, which means the Entity you build for an AR experience on iPhone runs — with minimal changes — as a spatial experience on Apple Vision Pro. iOS 26 doubled down on this convergence by adding ManipulationComponent for direct gesture handling and MeshInstancesComponent for high-performance instanced rendering.

This post covers RealityKit’s entity-component architecture, building and loading 3D content, gesture-driven manipulation, instanced rendering, and environment occlusion. We will not cover visionOS-specific immersive spaces or Reality Composer Pro’s shader graph — those are handled in visionOS Primer and Reality Composer Pro. This assumes familiarity with SwiftUI basics and async/await.

Note: RealityKit is available on iOS 13+, but the APIs discussed in this post require newer releases. RealityView requires iOS 18; ManipulationComponent and MeshInstancesComponent require iOS 26. Platform availability is noted for each API.

The Problem

Imagine building a Pixar-themed AR exhibit app where visitors can place Buzz Lightyear, Woody, and the Toy Story aliens on tables around a museum. The naive approach is to load a model, add it to a scene, and wire up gesture recognizers manually:

// The SceneKit / pre-RealityView approach — tightly coupled and platform-specific
import SceneKit
import ARKit

class ExhibitViewController: UIViewController, ARSCNViewDelegate {
    let sceneView = ARSCNView()
    var buzzNode: SCNNode?

    override func viewDidLoad() {
        super.viewDidLoad()
        sceneView.delegate = self
        sceneView.scene = SCNScene()
        view.addSubview(sceneView)

        // Manual model loading with no async support
        guard let scene = SCNScene(named: "BuzzLightyear.usdz") else { return }
        buzzNode = scene.rootNode.childNode(withName: "Buzz", recursively: true)

        // Manual gesture recognition
        let pan = UIPanGestureRecognizer(target: self, action: #selector(handlePan))
        let pinch = UIPinchGestureRecognizer(target: self, action: #selector(handlePinch))
        let rotation = UIRotationGestureRecognizer(
            target: self,
            action: #selector(handleRotation)
        )
        sceneView.addGestureRecognizer(pan)
        sceneView.addGestureRecognizer(pinch)
        sceneView.addGestureRecognizer(rotation)
    }

    // 50+ lines of gesture handling, hit testing, node transforms...
}

This approach has compounding problems. ARSCNView is a UIKit view, so embedding it in SwiftUI requires a UIViewRepresentable wrapper. Gesture handling requires manual hit tests, transform math, and state tracking across three separate recognizer callbacks. And none of this code transfers to visionOS, where the interaction model is entirely different.

RealityKit solves all of these problems with a unified entity-component system, SwiftUI-native RealityView, declarative gesture support via ManipulationComponent, and cross-platform compatibility from iPhone to Apple Vision Pro.

Entity-Component Architecture

Apple Docs: Entity — RealityKit

RealityKit uses an entity-component-system (ECS) architecture. If you have worked with game engines like Unity or Unreal, the pattern will be familiar:

  • Entity: A container. It has a transform (position, rotation, scale) and an identity, but no behavior or appearance on its own.
  • Component: A bag of data that gives an entity capabilities — a mesh, a material, physics properties, accessibility traits, or custom app logic.
  • System: Logic that runs every frame and operates on entities with matching components.

Here is a minimal entity with a visual representation — a toy block for our Pixar exhibit:

import RealityKit

@available(iOS 18, *)
@MainActor
func createToyBlock() -> ModelEntity {
    let mesh = MeshResource.generateBox(size: 0.1, cornerRadius: 0.005)
    let material = SimpleMaterial(
        color: .init(red: 0.2, green: 0.4, blue: 0.9, alpha: 1.0),
        roughness: 0.3,
        isMetallic: false
    )
    let block = ModelEntity(mesh: mesh, materials: [material])
    block.name = "PixarToyBlock"
    return block
}

ModelEntity is a subclass of Entity that comes pre-loaded with ModelComponent (mesh + materials) and CollisionComponent infrastructure. For entities that do not need a visual representation — anchors, grouping nodes, audio sources — use the base Entity class.

Building Entity Hierarchies

Entities form a tree. A child entity's transform is expressed relative to its parent, creating a coordinate-space hierarchy:

@available(iOS 18, *)
@MainActor
func createToyStoryScene() -> Entity {
    let root = Entity()
    root.name = "ToyStoryExhibit"

    // Andy's room floor
    let floor = ModelEntity(
        mesh: .generatePlane(width: 2.0, depth: 2.0),
        materials: [SimpleMaterial(color: .brown, isMetallic: false)]
    )
    floor.name = "AndysRoomFloor"
    root.addChild(floor)

    // Toy box in the corner
    let toyBox = ModelEntity(
        mesh: .generateBox(
            width: 0.3, height: 0.2, depth: 0.2, cornerRadius: 0.01
        ),
        materials: [SimpleMaterial(color: .red, isMetallic: false)]
    )
    toyBox.name = "ToyBox"
    toyBox.position = SIMD3(x: 0.8, y: 0.1, z: -0.8)
    root.addChild(toyBox)

    // Stack of blocks on top of the toy box
    for i in 0..<3 {
        let block = createToyBlock()
        block.position = SIMD3(
            x: 0.0,
            y: Float(i) * 0.11 + 0.15,
            z: 0.0
        )
        toyBox.addChild(block) // Position is relative to toyBox
    }

    return root
}

Custom Components

You can define custom components to attach app-specific data to entities. Components must conform to the Component protocol:

struct PixarCharacterComponent: Component {
    let characterName: String
    let movie: String
    let catchphrase: String
    var interactionCount: Int = 0
}

Attach it to an entity:

let buzzEntity = ModelEntity(
    mesh: .generateSphere(radius: 0.1),
    materials: [SimpleMaterial(color: .green, isMetallic: true)]
)
buzzEntity.components.set(PixarCharacterComponent(
    characterName: "Buzz Lightyear",
    movie: "Toy Story",
    catchphrase: "To infinity and beyond!"
))

Query components later when handling interactions:

if let character = entity.components[PixarCharacterComponent.self] {
    print("\(character.characterName): \(character.catchphrase)")
}
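One step the code above glosses over: custom component types must be registered with RealityKit once before any entity uses them, typically at app launch:

```swift
// Register the custom component type once, before it appears in any scene.
// A good place is your App type's initializer.
PixarCharacterComponent.registerComponent()
```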

Loading and Displaying 3D Content

Apple Docs: RealityView — RealityKit

RealityView is the SwiftUI container for RealityKit content. It replaced ARView (UIKit) as the primary way to display 3D content starting in iOS 18:

import SwiftUI
import RealityKit

@available(iOS 18, *)
struct PixarExhibitView: View {
    var body: some View {
        RealityView { content in
            // Called once to set up the scene
            let exhibit = createToyStoryScene()
            content.add(exhibit)
        } update: { content in
            // Called when SwiftUI state changes
        }
        .frame(maxWidth: .infinity, maxHeight: .infinity)
    }
}

The make closure runs once when the view appears. The update closure runs whenever SwiftUI state driving the view changes — use it to synchronize SwiftUI state with entity properties.
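As a sketch of that flow, here is a hypothetical slider-driven view that syncs SwiftUI state to an entity in the update closure. It reuses the createToyBlock() helper from earlier; the view name and scale range are illustrative:

```swift
import SwiftUI
import RealityKit

@available(iOS 18, *)
struct ScaledBlockView: View {
    @State private var blockScale: Float = 1.0

    var body: some View {
        VStack {
            RealityView { content in
                // make: runs once to build the scene.
                content.add(createToyBlock())
            } update: { content in
                // update: re-runs when blockScale changes; sync it to the entity.
                if let block = content.entities.first(where: { $0.name == "PixarToyBlock" }) {
                    block.scale = SIMD3(repeating: blockScale)
                }
            }
            Slider(value: $blockScale, in: 0.5...2.0)
                .padding()
        }
    }
}
```

Note that the update closure looks the entity up rather than capturing it; entities added in make are available through content.entities on subsequent updates.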

Loading USDZ Models

For production content, you will load .usdz or .reality files rather than generating meshes in code. Entity(named:in:) is async and can load from your app bundle:

@available(iOS 18, *)
struct BuzzLightyearView: View {
    @State private var buzzEntity: Entity?

    var body: some View {
        RealityView { content in
            do {
                let buzz = try await Entity(
                    named: "BuzzLightyear",
                    in: Bundle.main
                )
                buzz.position = SIMD3(x: 0, y: 0, z: -1.0)
                buzz.scale = SIMD3(repeating: 0.5)
                content.add(buzz)
                buzzEntity = buzz
            } catch {
                print("Failed to load Buzz: \(error)")
            }
        }
    }
}

Tip: .usdz files can be previewed directly in Xcode and Finder. Use Reality Composer Pro to assemble complex scenes from multiple USDZ assets, add materials, and configure physics — then export a .reality file that loads faster at runtime.

ManipulationComponent: Gesture-Driven Interaction

Apple Docs: ManipulationComponent — RealityKit

iOS 26 introduced ManipulationComponent, which replaces the manual gesture recognizer wiring we saw in the problem section. Add it to an entity and it automatically supports drag, rotate, and scale gestures:

@available(iOS 26, *)
@MainActor
func createManipulableBuzz() async throws -> Entity {
    let buzz = try await Entity(named: "BuzzLightyear", in: Bundle.main)

    // Enable collision for gesture hit testing
    let bounds = buzz.visualBounds(relativeTo: nil)
    buzz.components.set(CollisionComponent(
        shapes: [.generateBox(size: bounds.extents)]
    ))

    // Enable manipulation gestures
    buzz.components.set(ManipulationComponent(
        allowedModes: [.move, .rotate, .scale]
    ))

    // Required: mark as interactive
    buzz.components.set(InputTargetComponent())

    return buzz
}

With ManipulationComponent, you do not write gesture recognizer callbacks. RealityKit handles the gesture-to-transform mapping internally, including inertia, boundary constraints, and multi-touch coordination.

Constraining Manipulation

You can limit which axes or ranges a manipulation allows:

@available(iOS 26, *)
@MainActor
func createConstrainedEntity() -> ModelEntity {
    let entity = ModelEntity(
        mesh: .generateBox(size: 0.2),
        materials: [SimpleMaterial(color: .yellow, isMetallic: false)]
    )
    entity.name = "WoodyFigurine"

    entity.components.set(CollisionComponent(
        shapes: [.generateBox(size: SIMD3(repeating: 0.2))]
    ))

    // Only allow rotation and uniform scale
    let manipulation = ManipulationComponent(
        allowedModes: [.rotate, .scale]
    )
    entity.components.set(manipulation)
    entity.components.set(InputTargetComponent())

    return entity
}

Reacting to Manipulation in SwiftUI

Pair ManipulationComponent with SwiftUI gesture modifiers on RealityView to get callbacks during manipulation:

@available(iOS 26, *)
struct InteractiveExhibitView: View {
    @State private var selectedCharacter: String?

    var body: some View {
        RealityView { content in
            let buzz = try? await createManipulableBuzz()
            if let buzz { content.add(buzz) }
        }
        .gesture(
            DragGesture()
                .targetedToAnyEntity()
                .onChanged { value in
                    let component = value.entity
                        .components[PixarCharacterComponent.self]
                    if let component {
                        selectedCharacter = component.characterName
                    }
                }
        )
        .overlay(alignment: .bottom) {
            if let name = selectedCharacter {
                Text("Moving: \(name)")
                    .padding()
                    .background(.ultraThinMaterial, in: Capsule())
            }
        }
    }
}

MeshInstancesComponent: Instanced Rendering

Apple Docs: MeshInstancesComponent — RealityKit

When you need to render hundreds or thousands of identical meshes — a crowd of Toy Story aliens, a field of Coco marigolds, a swarm of Nemo’s clownfish — creating individual ModelEntity instances for each one is prohibitively expensive. MeshInstancesComponent, introduced in iOS 26, uses GPU instancing to render many copies of the same mesh with different transforms in a single draw call:

@available(iOS 26, *)
@MainActor
func createAlienCrowd(count: Int) -> Entity {
    let alienMesh = MeshResource.generateSphere(radius: 0.05)
    let alienMaterial = SimpleMaterial(
        color: .green,
        roughness: 0.5,
        isMetallic: false
    )

    // Generate instance transforms — each alien gets a unique position
    var instances: [MeshInstancesComponent.Instance] = []
    for i in 0..<count {
        let angle = Float(i) / Float(count) * 2.0 * .pi
        let radius: Float = 0.5 + Float.random(in: 0...0.3)
        let position = SIMD3<Float>(
            cos(angle) * radius,
            0.025,
            sin(angle) * radius
        )

        let transform = Transform(
            scale: SIMD3(repeating: Float.random(in: 0.8...1.2)),
            rotation: simd_quatf(
                angle: Float.random(in: 0...(.pi * 2)),
                axis: [0, 1, 0]
            ),
            translation: position
        )

        instances.append(MeshInstancesComponent.Instance(
            id: "alien_\(i)",
            transform: transform.matrix
        ))
    }

    let entity = Entity()
    entity.name = "ToyStoryAlienCrowd"

    // Single mesh + material shared across all instances
    entity.components.set(ModelComponent(
        mesh: alienMesh,
        materials: [alienMaterial]
    ))

    // Instance transforms for GPU-accelerated rendering
    entity.components.set(MeshInstancesComponent(instances: instances))

    return entity
}

With this approach, rendering 1,000 aliens costs roughly the same GPU time as rendering 10 individual entities because the mesh data is uploaded once and the GPU handles per-instance transforms. This is the same technique game engines use for foliage, particle effects, and crowd rendering.

Updating Instances at Runtime

You can update instance transforms each frame to create animations — for example, making the alien crowd bob up and down:

@available(iOS 26, *)
@MainActor
func animateAlienCrowd(entity: Entity, time: Float) {
    guard var instancesComponent = entity
        .components[MeshInstancesComponent.self] else {
        return
    }

    var updatedInstances: [MeshInstancesComponent.Instance] = []

    for instance in instancesComponent.instances {
        var transform = Transform(matrix: instance.transform)

        // Sinusoidal bob animation unique to each instance
        let hash = Float(instance.id.hashValue & 0xFFFF) / Float(0xFFFF)
        let bobOffset = sin(time * 2.0 + hash * .pi * 2.0) * 0.01
        transform.translation.y += bobOffset

        updatedInstances.append(MeshInstancesComponent.Instance(
            id: instance.id,
            transform: transform.matrix
        ))
    }

    instancesComponent.instances = updatedInstances
    entity.components.set(instancesComponent)
}

Environment Occlusion

Environment occlusion makes virtual objects appear to exist behind real-world surfaces. Without it, a virtual Buzz Lightyear placed behind a real table leg would render in front of it, breaking the illusion. RealityKit’s occlusion system uses the AR session’s mesh data to create invisible geometry that blocks rendering of virtual content:

@available(iOS 18, *)
@MainActor
func createOcclusionFloor() -> ModelEntity {
    let floor = ModelEntity(
        mesh: .generatePlane(width: 5.0, depth: 5.0),
        materials: [OcclusionMaterial()]
    )
    floor.name = "OcclusionFloor"
    floor.position = SIMD3(x: 0, y: 0, z: 0)
    return floor
}

OcclusionMaterial renders as invisible but still writes to the depth buffer. Any virtual content behind the occlusion surface is hidden, creating the illusion that real-world objects are in front of the virtual ones.
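A minimal sketch of the effect, pairing the createOcclusionFloor() helper above with a sphere positioned below the plane (the view name and positions are illustrative):

```swift
import SwiftUI
import RealityKit

@available(iOS 18, *)
struct OccludedExhibitView: View {
    var body: some View {
        RealityView { content in
            // Invisible plane that writes depth but draws no pixels.
            content.add(createOcclusionFloor())

            // A sphere below the plane: hidden whenever the plane
            // sits between it and the camera.
            let hiddenBuzz = ModelEntity(
                mesh: .generateSphere(radius: 0.1),
                materials: [SimpleMaterial(color: .green, isMetallic: false)]
            )
            hiddenBuzz.position = SIMD3(x: 0, y: -0.15, z: -0.5)
            content.add(hiddenBuzz)
        }
    }
}
```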

Scene Reconstruction Occlusion

On devices with a LiDAR scanner (Pro-model iPhones from the iPhone 12 Pro onward and LiDAR-equipped iPad Pro models), RealityKit can automatically generate occlusion geometry from the reconstructed real-world mesh. Enable scene reconstruction on your AR session to get this for free:

@available(iOS 18, *)
@MainActor
func createAutoOccludedExhibit() -> Entity {
    let anchor = AnchorEntity(
        .plane(
            .horizontal,
            classification: .floor,
            minimumBounds: SIMD2(0.5, 0.5)
        )
    )

    // On LiDAR devices with scene reconstruction enabled, occlusion is
    // automatic: virtual objects behind real furniture are hidden
    // without any manual occlusion geometry.
    let buzz = ModelEntity(
        mesh: .generateSphere(radius: 0.1),
        materials: [SimpleMaterial(color: .purple, isMetallic: true)]
    )
    buzz.name = "BuzzBehindCouch"
    buzz.position = SIMD3(x: 0, y: 0.1, z: -0.5)

    anchor.addChild(buzz)
    return anchor
}

Warning: Scene reconstruction requires LiDAR hardware and is not available on standard iPhone models (non-Pro). Always check ARWorldTrackingConfiguration.supportsSceneReconstruction(.mesh) before enabling it. On non-LiDAR devices, fall back to manual occlusion planes aligned with detected surfaces.
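That capability check is a one-liner (ARKit import assumed):

```swift
import ARKit

// Returns true only on LiDAR-equipped hardware.
func deviceSupportsSceneReconstruction() -> Bool {
    ARWorldTrackingConfiguration.supportsSceneReconstruction(.mesh)
}
```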

Advanced Usage

Custom Systems

For logic that runs every frame — animation, physics simulation, AI behavior — define a custom System:

struct AlienWanderSystem: System {
    static let query = EntityQuery(
        where: .has(PixarCharacterComponent.self)
    )

    init(scene: Scene) {}

    func update(context: SceneUpdateContext) {
        let deltaTime = Float(context.deltaTime)

        for entity in context.entities(
            matching: Self.query,
            updatingSystemStage: .postPhysics
        ) {
            guard var character = entity
                .components[PixarCharacterComponent.self] else {
                continue
            }

            // Simple wander behavior
            let angle = Float(character.interactionCount) * 0.1 + deltaTime
            let dx = cos(angle) * 0.001
            let dz = sin(angle) * 0.001
            entity.position += SIMD3(dx, 0, dz)

            character.interactionCount += 1
            entity.components.set(character)
        }
    }
}

Register the system when your scene initializes:

AlienWanderSystem.registerSystem()

Systems run on the render thread at frame rate (typically 60 or 120 Hz on ProMotion devices). Keep update implementations lightweight — heavy computation should be dispatched to a background thread and the results applied to entities asynchronously.

Physics and Collisions

RealityKit includes a built-in physics engine. Add PhysicsBodyComponent and CollisionComponent to make entities participate in physics simulation:

@available(iOS 18, *)
@MainActor
func createBouncingBlock() -> ModelEntity {
    let block = ModelEntity(
        mesh: .generateBox(size: 0.1, cornerRadius: 0.005),
        materials: [SimpleMaterial(color: .red, isMetallic: false)]
    )
    block.name = "PixarBlock"

    // Physics body — dynamic means it responds to forces and gravity
    block.components.set(PhysicsBodyComponent(
        massProperties: .init(mass: 0.5),
        material: .generate(
            staticFriction: 0.5,
            dynamicFriction: 0.3,
            restitution: 0.7 // Bounciness
        ),
        mode: .dynamic
    ))

    // Collision shape matching the visual mesh
    block.components.set(CollisionComponent(
        shapes: [.generateBox(size: SIMD3(repeating: 0.1))]
    ))

    // Drop from above
    block.position = SIMD3(x: 0, y: 1.0, z: -0.5)

    return block
}
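To react when the block lands, you can subscribe to collision events from the RealityView content. A sketch, assuming the createBouncingBlock() helper above and some static geometry for it to hit; the view name and overlay are illustrative:

```swift
import SwiftUI
import RealityKit

@available(iOS 18, *)
struct BouncingBlockView: View {
    @State private var bounceCount = 0
    @State private var subscription: EventSubscription?

    var body: some View {
        RealityView { content in
            content.add(createBouncingBlock())

            // Fires each time two collision shapes begin contact.
            // Store the subscription in state so it stays alive.
            subscription = content.subscribe(to: CollisionEvents.Began.self) { _ in
                bounceCount += 1
            }
        }
        .overlay(alignment: .top) {
            Text("Bounces: \(bounceCount)")
                .padding()
        }
    }
}
```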

Anchoring to Real-World Surfaces

AnchorEntity ties your content to detected real-world features — horizontal planes (tables, floors), vertical planes (walls), image targets, or face tracking:

@available(iOS 18, *)
@MainActor
func createTableAnchoredExhibit() -> AnchorEntity {
    let anchor = AnchorEntity(
        .plane(
            .horizontal,
            classification: .table,
            minimumBounds: SIMD2(0.3, 0.3)
        )
    )
    anchor.name = "TableExhibitAnchor"

    let toyBox = ModelEntity(
        mesh: .generateBox(
            width: 0.2, height: 0.15, depth: 0.15, cornerRadius: 0.01
        ),
        materials: [SimpleMaterial(
            color: .init(red: 0.6, green: 0.2, blue: 0.1, alpha: 1.0),
            isMetallic: false
        )]
    )
    toyBox.name = "AndysToyBox"
    toyBox.position = SIMD3(x: 0, y: 0.075, z: 0)

    anchor.addChild(toyBox)
    return anchor
}

Performance Considerations

Entity count. Each entity has overhead in the scene graph — transform updates, component lookups, and render submission. For scenes with fewer than 100 entities, this is negligible. Beyond 500 entities, consider MeshInstancesComponent for identical meshes or hierarchical culling to avoid processing off-screen entities.

Mesh complexity. Target 10,000-50,000 polygons per visible entity for iPhone. iPad Pro and Apple Vision Pro can handle 100,000+ per entity. Use LOD (level of detail) techniques — load a high-poly mesh when the camera is close and swap to a low-poly version at distance. Reality Composer Pro can configure LOD levels in your .reality files.

Materials. SimpleMaterial is the fastest material type. PhysicallyBasedMaterial offers realism but at higher GPU cost. ShaderGraphMaterial (from Reality Composer Pro) gives full creative control but requires careful optimization. In our Pixar exhibit app, use SimpleMaterial for background elements and PhysicallyBasedMaterial only for hero characters that demand visual fidelity.

Instanced rendering. MeshInstancesComponent with 1,000 instances is roughly equivalent in GPU cost to 1-5 individual ModelEntity instances. The savings come from reduced draw calls and shared mesh data. Use it liberally for repetitive geometry — crowds, decorations, environmental details.

Physics simulation. Physics runs on the CPU. Each dynamic body adds simulation cost proportional to its collision shape complexity. Use simple collision shapes (boxes, spheres, capsules) rather than mesh-based collision for dynamic bodies. Reserve .generateConvex(from:) and .generateStaticMesh(from:) for static environment geometry.

Profile your AR experience using Instruments with the RealityKit Trace template. It shows entity count, draw call count, GPU frame time, and physics simulation time per frame.

Apple Docs: RealityKit — RealityKit

When to Use (and When Not To)

| Scenario | Recommendation |
| --- | --- |
| AR experience on iOS with plane detection | Use RealityKit with RealityView and AnchorEntity. |
| visionOS spatial experience | Use RealityKit. It is the only supported 3D framework. |
| 2D game or sprite-based rendering | Use SpriteKit or Metal. ECS is overkill for 2D. |
| Complex shader-heavy 3D without AR | Consider Metal directly for full GPU pipeline control. |
| Cross-platform 3D (iOS + Android) | Use Unity or a cross-platform engine. |
| Quick 3D model viewer (no AR) | RealityKit works. Skip AnchorEntity. |
| Thousands of identical objects | Use MeshInstancesComponent. Do not create individual entities. |
| SceneKit migration | Migrate incrementally. Both coexist in the same app. |

Summary

  • RealityKit uses an entity-component-system architecture where Entity is a transform container, Component is a data bag, and System is per-frame logic. This architecture is shared across iOS, iPadOS, macOS, and visionOS.
  • RealityView is the SwiftUI container for 3D content. Use its make closure for initial setup and update closure to synchronize SwiftUI state with entity properties.
  • ManipulationComponent (iOS 26) provides declarative drag, rotate, and scale gesture handling — eliminating manual gesture recognizer wiring.
  • MeshInstancesComponent (iOS 26) enables GPU-instanced rendering of thousands of identical meshes with per-instance transforms, at a fraction of the cost of individual entities.
  • OcclusionMaterial and scene reconstruction (LiDAR) enable virtual objects to appear behind real-world surfaces, which is essential for convincing AR.
  • Custom Component types attach app-specific data to entities, and custom System types run per-frame logic at render rate.
  • Profile with the RealityKit Trace template in Instruments. Target under 100 entities for simple scenes; use instancing for large crowds.

For the companion ARKit integration guide — plane detection, world tracking configuration, and shared world anchors — see ARKit: Your First AR App. To build spatial content with visual tools rather than code, explore Reality Composer Pro.