
Working with AI models demands efficient and reliable dataset management. Yet, developers often face friction when downloading, organizing, and maintaining large datasets — especially when juggling complex data structures, multiple versions, and different environment setups. Manually handling this process can be time-consuming and prone to error.
In previous releases, we introduced the ability to generate versioned datasets with structured splits for training, validation, and testing. Now, to further simplify dataset management and bring versioned data closer to your development workflow, we're excited to introduce the Ocular AI SDK — a developer-friendly toolkit to make accessing, syncing, and managing datasets seamless.
SDK Features: Initial Release
- Simple download URLs for quick access to dataset version exports
- API key authentication for secure and seamless integration
- Built-in retry logic with download progress tracking for reliable, uninterrupted transfers
Developer-First Approach
- Clean and intuitive API design
- Comprehensive error handling
- Detailed logging
- Environment variable configurations
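As an illustration of environment-based configuration, the sketch below reads an API key from the environment instead of hard-coding it. The variable name `OCULAR_API_KEY` and the helper `load_api_key` are assumptions made for this example, not documented SDK settings.

```python
import os


def load_api_key(env_var: str = "OCULAR_API_KEY") -> str:
    """Read the API key from the environment and fail early if it is missing.

    The default variable name is a placeholder chosen for this sketch.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before creating the client."
        )
    return api_key
```

The returned value can then be passed as `api_key` when the client is constructed, which keeps secrets out of notebooks and version control.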
Steps to Integrate Datasets in Your Work Environment
Step 1: Export Your Dataset
Navigate to the versions page and select a version. Click the "Create Export" button to generate an export in your required format.

Step 2: Choose Your Access Method
Once an export is generated in your desired format, you'll get a popup with the SDK code snippet.
We provide multiple options to access the dataset:
- SDK snippet: For seamless integration with Jupyter notebooks and Python workflows
- cURL command: For CLI-based downloads via terminal or automation scripts
- Public download URL: For direct browser downloads or sharing with collaborators

Note: You can check out how to get your Ocular API key in this blog.
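For the public download URL option, a scripted download might look like the sketch below. It uses only the Python standard library and streams the file in chunks so large exports are not held in memory. `download_export` is a hypothetical helper, not part of the SDK, and the URL in the usage comment is a placeholder.

```python
import urllib.request
from pathlib import Path


def download_export(url: str, dest: Path, chunk_size: int = 1 << 20) -> int:
    """Stream `url` to `dest` in chunks and return the number of bytes written."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    written = 0
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:  # an empty read means the transfer is complete
                break
            out.write(chunk)
            written += len(chunk)
    return written


# Usage (placeholder URL; paste the one shown in the export popup):
# size = download_export("https://example.com/your-export.zip", Path("datasets/export.zip"))
# print(f"Downloaded {size / 1e6:.1f} MB")
```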
Terminal Output
When executing SDK operations, the client provides detailed runtime information through structured logging. The example below demonstrates the complete lifecycle of a dataset export operation:
    2025-04-04 17:00:27,961 - ocular - INFO - Accessing workspace: 712368e2-af67-4de6-bc35-5367793f9b09
    2025-04-04 17:00:29,907 - ocular - INFO - Retrieving project 41fa3ff1-811c-4c67-82a5-f82bc376e284 from workspace 712368e2-af67-4de6-bc35-5367793f9b09
    2025-04-04 17:00:38,632 - ocular - INFO - Downloading export 122fc4fe-45f1-4427-bd4a-bfae0f8bc2c1
    Downloading: 12.0 MB downloaded (3.2 MB/s)
    2025-04-04 17:00:46,166 - ocular - INFO - Downloaded export to /content/datasets/export_122fc4fe-45f1-4427-bd4a-bfae0f8bc2c1.zip
The log output follows a consistent pattern:
- Timestamp: ISO 8601-compatible timestamps for operation sequencing
- Component: Module identifier (ocular)
- Log level: Indicates message severity (INFO, DEBUG, etc.)
- Operation details: Contextual information including UUIDs and paths
- Transfer metrics: Real-time throughput and completion statistics
This structured output facilitates integration with log aggregation systems for monitoring and debugging complex workflows.
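To show how this format could feed a log pipeline, here is a minimal parsing sketch. The regular expression is inferred from the sample output above rather than taken from the SDK, and `parse_log_line` is a hypothetical helper. Lines that do not follow the pattern, such as the bare progress line, return `None`.

```python
import re
from datetime import datetime

# Pattern inferred from the sample output: timestamp - component - level - message
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"
    r" - (?P<component>\S+)"
    r" - (?P<level>[A-Z]+)"
    r" - (?P<message>.*)$"
)


def parse_log_line(line: str):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None  # e.g. the "Downloading: ..." progress line
    record = match.groupdict()
    # Convert the timestamp text into a datetime for sorting and filtering.
    record["timestamp"] = datetime.strptime(
        record["timestamp"], "%Y-%m-%d %H:%M:%S,%f"
    )
    return record
```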
Client Configuration
The SDK is designed with configurable parameters to accommodate various network conditions and security requirements. The client can be initialized with custom settings to optimize performance based on your infrastructure:
    ocular = Ocular(
        api_key="your_api_key",  # Required
        api_url="https://api.useocular.com",  # Optional
        timeout=300,  # Request timeout in seconds
        max_retries=3,  # Maximum number of retries
        backoff_factor=0.5,  # Backoff factor for retries
    )
These parameters allow you to fine-tune network behavior, which is particularly useful when working with unstable connections or transferring datasets across high-latency networks. Retries use exponential backoff, so the wait between attempts grows after each failure instead of repeating immediately.
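To make the interaction between `max_retries` and `backoff_factor` concrete, the sketch below uses one common exponential backoff scheme, where the wait before retry n is `backoff_factor * 2 ** (n - 1)`. Whether the SDK uses exactly this formula is an assumption; both helpers are hypothetical and exist only for illustration.

```python
import time


def backoff_delays(max_retries: int = 3, backoff_factor: float = 0.5) -> list[float]:
    """Wait time in seconds before each retry, doubling every attempt (assumed scheme)."""
    return [backoff_factor * 2 ** attempt for attempt in range(max_retries)]


def call_with_retries(operation, max_retries=3, backoff_factor=0.5, sleep=time.sleep):
    """Call operation(); on OSError, wait and retry up to max_retries times."""
    delays = backoff_delays(max_retries, backoff_factor)
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except OSError:
            if attempt == max_retries:
                raise  # retries exhausted, surface the last error
            sleep(delays[attempt])


print(backoff_delays(max_retries=3, backoff_factor=0.5))  # [0.5, 1.0, 2.0]
```

With the values from the configuration example, a request that keeps failing would wait 0.5, 1.0, and 2.0 seconds between attempts under this scheme. Raising `backoff_factor` spreads retries further apart on congested links.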
Logging Configuration
The SDK implements a configurable logging system built on Python's standard library that can be tailored to different development environments and operational requirements:
    from ocular.utils.logging import setup_logging

    # Setup logging with custom settings
    logger = setup_logging(
        level="DEBUG",
    )
Use cases for the logging levels:
- For more detailed logging from the SDK, set the level to DEBUG.
- For only the important logs, such as progress, workspace, and project information, set the level to INFO.
- The default level is INFO, which logs timestamps along with important warnings and failures.
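Because the SDK builds on Python's standard logging module, the effect of the level setting can be illustrated with the standard library alone. The sketch below does not call the SDK: it uses a stand-in logger named `ocular_demo` (a name chosen for this example) and a format string that mirrors the output shown earlier.

```python
import io
import logging

# Stand-in logger for illustration; it does not touch the SDK's own logger.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
)

logger = logging.getLogger("ocular_demo")
logger.addHandler(handler)
logger.propagate = False  # keep the demo output out of the root logger

logger.setLevel(logging.INFO)
logger.debug("Preparing request")   # dropped: DEBUG is below INFO
logger.info("Downloading export")   # kept

logger.setLevel(logging.DEBUG)
logger.debug("Preparing request")   # kept now that DEBUG is enabled

captured = stream.getvalue().splitlines()
print(len(captured))  # 2
```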
SDK Upcoming Support
We are actively working on expanding the SDK's capabilities. Upcoming releases will support:
- Ingesting data from cloud integrations
- Data pre-processing pipelines
- Model training, customization, deployment, and fine-tuning
Get Involved
- Join our Slack community
- Follow Launch Week II updates