Supervised Fine Tuning

Supervised Fine Tuning (SFT) lets you fine-tune Newton on curated examples drawn from your own data or use case. Fine-tuning can adapt Newton to your sensor configuration or transfer the domain knowledge your use case requires. The result is a new instance of Newton identified by a specific model_id. The fine-tuned model is encrypted and access-controlled, and it is visible only to your organization.

Labeled Examples

SFT relies on high-quality examples to steer a pre-trained instance of Newton toward a new sensor configuration or use case. Each labeled example is an input-output pair in JSON or YAML format, which tells Newton that, given input X, it should generate output Y. Because Newton is a multimodal model, the input (X) can consist of one or more sensor types; the output (Y) is currently constrained to text. Some examples of X, Y pairs are:
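
As a minimal sketch of what one X, Y pair looks like in practice, the Python below assembles a text-to-text labeled example and serializes it to JSON. The field names (lens_parameters, inputs, outputs, type, event_data, contents) are taken from the examples later in this section; the helper make_text_example and the sample strings are hypothetical and are not part of any Newton SDK.

```python
import json


def make_text_example(instruction: str, input_text: str, output_text: str) -> dict:
    """Assemble one labeled example (input X, output Y) as a plain dict.

    Hypothetical helper: it only mirrors the field layout shown in this
    section's examples and does not call any Newton API.
    """
    return {
        "lens_parameters": {"instruction": instruction},
        "inputs": [{"type": "data.text", "event_data": {"contents": input_text}}],
        "outputs": [{"type": "data.text", "event_data": {"contents": output_text}}],
    }


# Illustrative values only.
example = make_text_example(
    instruction="You are a retail monitoring system.",
    input_text="Entrance: active at 2025-08-18T12:25:00",
    output_text="[12:25:00 - 12:25:01] Customer at Entrance",
)

# Serialize to JSON, one of the two formats mentioned above.
serialized = json.dumps(example, indent=2)
print(serialized)
```

Keeping the example as a plain dict until the last step makes it easy to inspect or validate before writing it out.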

Sensor Log Example

The following example shows a fine-tuning instance that maps a short sensor log to a text output. In this case, motion sensor readings from a retail store are converted into a simple report describing a customer's movement through the space. The input is a data.text event containing a list of sensor logs, and the output is a data.text event containing the model's desired response.
lens_parameters:
  instruction: |
    You are a retail monitoring system. Monitor the motion of customers throughout a store based on the presence sensors.
    Output a report on the customer's activity based on the logs in the format: [time start - time end] {activity}.
inputs:
  - type: data.text
    event_data:
      contents: |
        - sensor_name: Entrance
          log:
            - presence_state: "active"
              timestamp: "2025-08-18T12:25:00"
            - presence_state: "inactive"
              timestamp: "2025-08-18T12:25:01"
        - sensor_name: Aisle1
          log:
            - presence_state: "inactive"
              timestamp: "2025-08-18T12:25:00"
        ...
outputs:
  - type: data.text
    event_data:
      contents: |
        Here is a summarized report of customer activity based on the given sensor logs.
        * [12:25:00 - 12:25:01] Customer at Entrance
        * [12:25:01 - 12:25:03] Customer in Aisle1
        * [12:25:03 - 12:25:06] Customer in Aisle2
        * [12:25:06 - 12:25:09] Customer in Aisle3
        * [12:25:09 - 12:25:10] Customer at Checkout
        * [12:25:10 - onward] Customer at Exit
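
The contents field of the data.text input above is plain text laid out as a list of per-sensor logs. As a rough sketch, assuming raw readings are available in memory as (presence_state, timestamp) pairs per sensor, the text could be generated as follows. The function format_sensor_logs and the shape of its input are hypothetical; only the output layout follows the example above.

```python
from __future__ import annotations


def format_sensor_logs(logs: dict[str, list[tuple[str, str]]]) -> str:
    """Render per-sensor readings as the list-style text used in the example.

    `logs` maps a sensor name to (presence_state, timestamp) pairs.
    Hypothetical helper; not part of any Newton SDK.
    """
    lines = []
    for sensor_name, readings in logs.items():
        lines.append(f"- sensor_name: {sensor_name}")
        lines.append("  log:")
        for state, timestamp in readings:
            lines.append(f'    - presence_state: "{state}"')
            lines.append(f'      timestamp: "{timestamp}"')
    return "\n".join(lines) + "\n"


# The readings shown explicitly in the example above.
contents = format_sensor_logs(
    {
        "Entrance": [
            ("active", "2025-08-18T12:25:00"),
            ("inactive", "2025-08-18T12:25:01"),
        ],
        "Aisle1": [("inactive", "2025-08-18T12:25:00")],
    }
)
print(contents)
```

Generating the text from structured readings keeps indentation consistent across training examples, which is easy to get wrong when the logs are written by hand.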

Base64 Image Example

The example below shows a fine-tuning instance that maps a base64-encoded image to a text output. In this case, the model generates traffic alerts, with a specific rule to issue an alert whenever a red pickup truck is detected. The input is a single base64-encoded image supplied as a data.base64_img event, and the output is a data.text event containing the model's desired response.
lens_parameters:
  instruction: |
    You are an intelligent traffic monitoring system. Monitor the traffic feed for the following activity or object.
    Output alerts in the format: [alert]{activity or object name}: {description of activity or object}
    Context: this camera is on 104th and main street looking west-to-east from left-to-right.
    Focus: All red pickup trucks driving east.
inputs:
  - type: data.base64_img
    event_data:
      contents: iVBORw0KGgoAAAANSUh...
outputs:
  - type: data.text
    event_data:
      contents: "[alert]red pickup truck: A red pickup truck is driving on 104th and main heading east."

This example would create a training example formatted as follows:

[Figure: Newton Labeling Example]
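
The data.base64_img input in the traffic example can be produced with Python's standard base64 module. The sketch below assumes the contents field expects the raw base64 string with no data-URI prefix, which is what the truncated value in the example suggests but is not confirmed here. The helper make_image_input is hypothetical, and the eight-byte PNG signature stands in for a real image so the snippet runs without a file.

```python
import base64


def make_image_input(image_bytes: bytes) -> dict:
    """Wrap raw image bytes as a data.base64_img input entry.

    Hypothetical helper; assumes `contents` takes a bare base64 string.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"type": "data.base64_img", "event_data": {"contents": encoded}}


# Stand-in for a real image: the fixed 8-byte signature that starts every PNG file.
# For a real file you might use pathlib.Path("frame.png").read_bytes(),
# where "frame.png" is a hypothetical path.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

image_input = make_image_input(PNG_SIGNATURE)
print(image_input["event_data"]["contents"])  # prints iVBORw0KGgo=
```

Every PNG file encodes to a string beginning with iVBORw0KGgo, which matches the start of the truncated contents value in the example and offers a quick sanity check that the image was encoded rather than passed as a file path.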