Using AWS SageMaker Endpoint

Amazon SageMaker is a managed ML service that helps you build and train models and then deploy them into a production-ready hosted environment.

What is a SageMaker Endpoint?

One of SageMaker's key features is the ability to deploy a trained machine learning model as an endpoint, which can then be invoked to make predictions on new data. Real-time inference suits workloads with interactive, low-latency requirements. When you deploy your model to SageMaker hosting services, you get an endpoint that can be used for inference. These endpoints are fully managed and support autoscaling.

Prerequisites for setting up a SageMaker Endpoint (Model Artifact)

  • After the modeling phase is complete, store the trained model in your project-allocated S3 bucket.
  • Open the SageMaker console, expand "Inference" in the left panel, and click "Create model".
  • Give your model a name and choose the appropriate container image for your model.
  • Specify the location of your trained model artifacts (e.g., the S3 URI of the model archive).
  • Specify any additional configuration settings (e.g., IAM role, environment variables). The instance type and number of instances are chosen later, when you configure the endpoint.
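
The console steps above can also be scripted. The following is a minimal sketch, assuming the boto3 `sagemaker` client and its `create_model` call; the helper names `build_create_model_request` and `create_model`, and all argument values, are hypothetical placeholders rather than part of this project's setup.

```python
def build_create_model_request(model_name, image_uri, model_data_url, role_arn):
    """Assemble keyword arguments for the SageMaker `create_model` API call."""
    # The trained model artifact is expected to live in S3 already.
    if not model_data_url.startswith("s3://"):
        raise ValueError("model_data_url must be an S3 URI, e.g. s3://bucket/key")
    return {
        "ModelName": model_name,
        "PrimaryContainer": {
            "Image": image_uri,              # container image used for inference
            "ModelDataUrl": model_data_url,  # S3 location of the model artifact
        },
        "ExecutionRoleArn": role_arn,        # IAM role SageMaker assumes
    }


def create_model(sm_client, **kwargs):
    """Register the model. `sm_client` is assumed to be boto3.client("sagemaker")."""
    request = build_create_model_request(**kwargs)
    return sm_client.create_model(**request)


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   create_model(
#       boto3.client("sagemaker"),
#       model_name="<your-model-name>",
#       image_uri="<your-container-image-uri>",
#       model_data_url="s3://<your-bucket>/<path>/model.tar.gz",
#       role_arn="<your-execution-role-arn>",
#   )
```

Passing the client in as an argument keeps the helper easy to try against a stub before running it against a real account.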

Setting up a SageMaker Endpoint

  • Open the SageMaker console, expand "Inference" in the left panel, and click "Create endpoint".
  • Give your endpoint a name and choose the appropriate configuration settings (e.g., instance type, number of instances).
  • Select the model you created in the previous step.
  • Click "Create endpoint". Provisioning takes a few minutes; once the endpoint status changes to InService, it is ready to be invoked.
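
As a scripted counterpart to the steps above, here is a minimal sketch assuming the boto3 `sagemaker` client. The function name `create_endpoint`, the `-config` naming suffix, the variant name `AllTraffic`, and the default instance type are illustrative assumptions, not values prescribed by this guide.

```python
def create_endpoint(sm_client, endpoint_name, model_name,
                    instance_type="ml.m5.large", instance_count=1, wait=True):
    """Create an endpoint configuration and an endpoint for an existing model.

    `sm_client` is assumed to be boto3.client("sagemaker").
    Returns the name of the endpoint configuration that was created.
    """
    config_name = f"{endpoint_name}-config"  # hypothetical naming convention

    # The endpoint configuration ties the model to an instance type and count.
    sm_client.create_endpoint_config(
        EndpointConfigName=config_name,
        ProductionVariants=[{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": instance_count,
        }],
    )

    # Creating the endpoint starts provisioning the instances.
    sm_client.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=config_name,
    )

    if wait:
        # Block until the endpoint reports the InService status.
        sm_client.get_waiter("endpoint_in_service").wait(EndpointName=endpoint_name)
    return config_name
```

Waiting on the `endpoint_in_service` waiter avoids invoking the endpoint while it is still being provisioned.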

Invoking the endpoint

  • Invoke the endpoint from your notebook using the `invoke_endpoint` method of the SageMaker runtime client object. In the call below, `runtime` is that client and `input_data` is your JSON-serializable request payload.

runtime = boto3.client('sagemaker-runtime')  # requires: import boto3, json
response = runtime.invoke_endpoint(EndpointName='<your-endpoint-name>', ContentType='application/json', Body=json.dumps(input_data))

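  • The prediction is returned in `response['Body']` as a stream; read and decode it, then parse it according to the format your model returns. Likewise, shape the input payload to match what your model expects.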
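
To illustrate the invocation and parsing steps together, here is a minimal sketch. It assumes the model both accepts and returns JSON, which may not hold for every container; the helper name `invoke_json` is hypothetical.

```python
import json


def invoke_json(runtime, endpoint_name, input_data):
    """Send a JSON payload to an endpoint and return the parsed JSON reply.

    `runtime` is assumed to be boto3.client("sagemaker-runtime").
    Assumption: the model container responds with a JSON document.
    """
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(input_data),
    )
    # The body is a stream of bytes: read it fully, then decode and parse.
    raw = response["Body"].read()
    return json.loads(raw.decode("utf-8"))


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   runtime = boto3.client("sagemaker-runtime")
#   prediction = invoke_json(runtime, "<your-endpoint-name>", {"instances": [[1.0, 2.0]]})
```

If your model returns CSV or another format, replace the `json.loads` step with the matching parser.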

Remember: It's crucial to delete or scale down your endpoint when not actively in use to avoid unnecessary charges.
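
Cleanup can be scripted as well. The sketch below assumes the boto3 `sagemaker` client; the helper name `delete_endpoint_resources` is hypothetical, and whether to also remove the endpoint configuration is a choice left to you.

```python
def delete_endpoint_resources(sm_client, endpoint_name, delete_config=True):
    """Delete an endpoint and, optionally, its endpoint configuration.

    `sm_client` is assumed to be boto3.client("sagemaker").
    Returns the name of the endpoint configuration the endpoint was using.
    """
    # Look up the configuration name before the endpoint is gone.
    description = sm_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = description["EndpointConfigName"]

    # Deleting the endpoint releases its instances, which stops the charges.
    sm_client.delete_endpoint(EndpointName=endpoint_name)

    if delete_config:
        sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    return config_name
```

The model itself is left in place, so the endpoint can be recreated later from the same model.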