Before you Begin

Complete the Python Client Setup before continuing. You’ll need Python 3.10+ and your API key configured. For information on environment variables, see Environment Variables.

1. Create and Run a Lens Session

1. Setup

If you haven’t already done so, follow these instructions to install the Archetype AI Python client, or run the following directly in your terminal:
pip install archetypeai
export ATAI_API_KEY="insert_your_key"
Note: Replace "insert_your_key" with your actual API key from Archetype AI.
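As a quick sanity check before running any examples, you can confirm that the key is visible to Python. This is a minimal sketch using only the standard library; it only checks that the environment variable is set and non-empty, it does not validate the key against the API.

```python
import os

def api_key_is_configured(env_var: str = "ATAI_API_KEY") -> bool:
    """Return True if the API key environment variable is set and non-empty."""
    return bool(os.environ.get(env_var, "").strip())

if api_key_is_configured():
    print("ATAI_API_KEY is set.")
else:
    print("ATAI_API_KEY is missing -- export it before running the examples.")
```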

2. Create and Run a Lens Session

Next, you will create a Python example that defines and runs a custom Lens on a pre-recorded video. A Lens controls how data is ingested, interpreted, and returned by Newton. The example below uses default parameters; you can edit the values in the lens configuration later to customize the analysis for your needs. Open the following code snippet in your text editor to get started:
import logging
from archetypeai import ArchetypeAI, ArgParser

def main(args):
    # Create a new client using your unique API key.
    client = ArchetypeAI(args.api_key, api_endpoint=args.api_endpoint)

    # Upload the video file to the Archetype AI platform.
    file_response = client.files.local.upload(args.filename)

    # Create a custom lens and automatically launch the lens session.
    client.lens.create_and_run_lens(f"""
       lens_name: Custom Activity Monitor
       lens_config:
        model_parameters:
            model_version: Newton::c2_4_7b_251215a172f6d7
            instruction: {args.instruction}
            temporal_focus: 5
            max_new_tokens: 256
        input_streams:
            - stream_type: video_file_reader
              stream_config:
                file_id: {file_response['file_id']}
        output_streams:
            - stream_type: server_sent_events_writer
    """, session_callback, client=client, args=args)

def session_callback(
        session_id: str,
        session_endpoint: str,
        client: ArchetypeAI,
        args: dict
    ) -> None:
    """Main function to run the logic of a custom lens session."""

    # Create an SSE reader to read the output of the lens.
    sse_reader = client.lens.sessions.create_sse_consumer(
        session_id, max_read_time_sec=args.max_run_time_sec)

    # Read events from the SSE stream until either the last message is
    # received or the max read time has been reached.
    for event in sse_reader.read(block=True):
        logging.info(f"[sse_reader] {event}")

    # Close any active reader.
    sse_reader.close()

if __name__ == "__main__":
    parser = ArgParser()
    parser.add_argument("--filename", required=True, type=str)
    parser.add_argument("--instruction", default="Describe the actions in the video.", type=str)
    parser.add_argument("--max_run_time_sec", default=10.0, type=float)
    args = parser.parse_args(configure_logging=True)

    # Validate the input.
    assert args.filename.endswith(".mp4"), "Enter an .mp4 video file"

    main(args)
To run this example, save the code above as example.py, then run the following in your terminal:
python example.py --filename=your_test_video.mp4
Note: Replace your_test_video.mp4 with the actual name of your video file. Download our sample delivery video to follow along:
curl -L -o delivery.mp4 "https://drive.usercontent.google.com/download?id=1Asy5D47WiDF0mGgP_3krS_GhZFS_c0mv&confirm=t"

2. What you should see

After running the command, a live description of the video should stream to your terminal. Here’s an example response:
'response': ['The video shows a FedEx Ground delivery truck parked on a residential driveway. A delivery person exits the truck carrying a package. The environment appears to be a suburban neighborhood with well-maintained lawns and a clear sky.']
Lenses created via the API will appear in your Workbench where you can monitor sessions, view results, and manage your lens configurations.
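If you want to post-process the stream instead of only logging it, you can handle each event inside the read loop. The sketch below assumes each event is a dict that may carry a 'response' list, as in the example output above; the exact event schema is an assumption here, so adapt the key name to what you actually see in your terminal.

```python
def extract_responses(event: dict) -> list:
    # The 'response' key is assumed from the example output above; other
    # event types (e.g., status messages) may not carry it, so default to [].
    return event.get("response", [])

# Example event shaped like the sample response above.
event = {"response": ["The video shows a FedEx Ground delivery truck parked on a residential driveway."]}
for text in extract_responses(event):
    print(text)
```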

Try Variations

Using the same video (or a starter video), experiment with how Lens parameters affect the output:
  • Edit Instruction - Modify the --instruction argument to change what the model analyzes:
python example.py --filename=delivery.mp4 --instruction="your instruction here"
Example instruction:
You are an expert perception agent analyzing activity in front of a house.
Based only on observable movement, trajectory, speed, and interactions with objects, tell me:
- Current intention (what the person appears to be trying to do),
- Movement direction & destination,
- Active behaviors (e.g., approaching, waiting, inspecting, departing),
- Confidence level (low / medium / high).
Anything that's low confidence, please report 'no significant activity'.
Respond in 20 words or less.
  • Adjust Temporal Focus - The temporal focus controls how large a time window the Newton model analyzes per inference. Change temporal_focus in the code from 5 seconds to 20 seconds to compare shorter vs. longer analysis windows.
For advanced parameters, see the Parameter Overview.
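To compare settings quickly, you can factor the lens configuration out into a small helper and expose temporal_focus as a parameter. This is a sketch built from the configuration used in the example above; build_lens_config is a hypothetical helper for illustration, not part of the client library.

```python
def build_lens_config(instruction: str, file_id: str,
                      temporal_focus: int = 5, max_new_tokens: int = 256) -> str:
    # Mirrors the lens configuration from the example above, with the
    # tunable values exposed as function parameters.
    return f"""
lens_name: Custom Activity Monitor
lens_config:
  model_parameters:
    model_version: Newton::c2_4_7b_251215a172f6d7
    instruction: {instruction}
    temporal_focus: {temporal_focus}
    max_new_tokens: {max_new_tokens}
  input_streams:
    - stream_type: video_file_reader
      stream_config:
        file_id: {file_id}
  output_streams:
    - stream_type: server_sent_events_writer
"""

# Build two configs to compare a short vs. long analysis window.
short_window = build_lens_config("Describe the actions.", "file_123", temporal_focus=5)
long_window = build_lens_config("Describe the actions.", "file_123", temporal_focus=20)
```

The returned string can then be passed to client.lens.create_and_run_lens in place of the inline f-string.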

Troubleshooting

If you’re getting errors, see the Troubleshooting page for common issues and solutions, or check the Python Client documentation for installation and upgrade instructions.

Learn More