Prerequisites

The instructions below assume you have cloned this repository into a parent directory at ~/atai. If you have already installed the ATAI Python Library and cloned the cookbook repository, proceed directly to Quick Start.

Install Conda

wget https://repo.anaconda.com/archive/Anaconda3-2022.05-Linux-x86_64.sh
bash Anaconda3-2022.05-Linux-x86_64.sh
source ~/.bashrc

Set Up Dev Environment

conda create -n dev_env python=3.10
conda activate dev_env
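
To confirm that the activated environment uses the Python version requested above, a quick check like the following can help. This is a minimal sketch: the helper name `python_matches` is hypothetical and is not part of the ATAI Python Library.

```python
import sys


def python_matches(version_info, required=(3, 10)):
    """Return True if the major.minor version equals `required`."""
    # `python_matches` is a hypothetical helper written for this guide.
    return tuple(version_info[:2]) == tuple(required)


if __name__ == "__main__":
    # Should print True inside dev_env, which was created with python=3.10.
    print(python_matches(sys.version_info))
```

If this prints False, the most likely cause is that dev_env is not the active environment in the current shell.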

Install ATAI Python Library

git clone git@github.com:archetypeai/python-client.git
cd python-client
python -m pip install .

Clone Cookbook Repository

git clone https://github.com/archetypeai/archetypeai-cookbook.git

Quick Start

Analyze video content through natural language queries using the interactive CLI.

Running the Demo

From the Cookbook root directory, with your conda environment activated:

cd command-line-demos/activity-monitor
python quickstart.py

Sample videos for testing the demo are available in the sample_videos/ directory.

Interactive Prompts

  1. API Key: Your ArchetypeAI API key
  2. Input Type: Choose video (local file) or rtsp (camera stream)
  3. Source:
    • For video: Path to video file (drag & drop supported)
    • For RTSP: Camera stream URL
  4. Focus: Your question about the video (e.g., “Is there a person?”, “What’s happening?”)
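
Before starting a session, it can be useful to sanity-check the input type and source you plan to enter. The sketch below illustrates one way to do that. It is not taken from quickstart.py; `validate_source` is a hypothetical helper, and the quote-stripping step assumes that some terminals wrap dragged paths in quotes.

```python
import os
from urllib.parse import urlparse


def validate_source(input_type, source):
    """Return an error message for an invalid pair, or None if it looks valid.

    Hypothetical helper for illustration; not part of the ATAI library.
    """
    input_type = input_type.strip().lower()
    # Assumption: some terminals wrap drag-and-dropped paths in quotes.
    source = source.strip().strip("'\"")

    if input_type == "video":
        # A local video must point at an existing file.
        if not os.path.isfile(source):
            return f"Video file not found: {source}"
        return None

    if input_type == "rtsp":
        # A camera stream must be an rtsp:// URL with a host part.
        parsed = urlparse(source)
        if parsed.scheme != "rtsp" or not parsed.netloc:
            return f"Not an RTSP URL: {source}"
        return None

    return f"Unknown input type: {input_type!r} (expected 'video' or 'rtsp')"
```

A return value of None means the pair passed these basic checks. It does not guarantee that the file is a decodable video or that the camera stream is reachable.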

Example Session

=== Activity Monitor Quickstart ===

Enter your ArchetypeAI API key: your-key-here

Input type (video/rtsp): video
Enter path to video file: /path/to/security_footage.mp4

What should the monitor look for? (e.g., 'person entering', 'vehicle'): Is there anyone at the door?

--- Configuration Summary ---
Input: VIDEO
Video file: /path/to/security_footage.mp4
Focus: Is there anyone at the door?

Press Enter to start monitoring...

🔍 Monitoring started - Looking for: 'Is there anyone at the door?'

Response: No, the door area is empty. I can see a driveway and front yard but no person present.
Response: Yes, there is a person approaching the front door carrying a package.

Output

The system provides natural language responses to your questions about the video content. Responses update as the video progresses (for files) or continuously (for RTSP streams).

Step Size Configuration (in review)

The step size defines the interval between analyzed frames in each inference cycle. A larger step size means fewer frames are examined, which speeds up processing at the cost of detail. A smaller step size examines more frames, which gives more context for answering queries but requires more computation.
  • Default: 60 - Optimal for most scenarios
  • Short videos: 30 - Ensures sufficient granularity for brief clips
  • Long-form content: 90-120 - Captures broader context and patterns
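
The trade-off above can be illustrated with a small calculation. This is a sketch under two stated assumptions that the text does not confirm: that the step size is measured in frames, and that sampling starts at frame 0. The function `sampled_frame_indices` is hypothetical and is not part of the ATAI library.

```python
def sampled_frame_indices(total_frames, step_size=60):
    """Indices of the frames analyzed when every `step_size`-th frame is sampled.

    Assumption: step size is counted in frames and sampling starts at frame 0.
    """
    if step_size <= 0:
        raise ValueError("step_size must be a positive integer")
    return list(range(0, total_frames, step_size))


# Example: a 10-second clip at 30 fps has 300 frames.
for step in (30, 60, 120):
    print(step, len(sampled_frame_indices(300, step)))
# Prints:
# 30 10
# 60 5
# 120 3
```

Under these assumptions, halving the step size doubles the number of frames examined, which is why smaller values suit brief clips and larger values suit long-form content.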