# Reference

The Apache Spark website is the best reference for getting started with programming, deploying, and running Spark applications:

<https://spark.apache.org/docs/latest/index.html>

**Programming Guides:**

* [Quick Start](https://spark.apache.org/docs/latest/quick-start.html): a quick introduction to the Spark API; start here!
* [RDD Programming Guide](https://spark.apache.org/docs/latest/rdd-programming-guide.html): overview of Spark basics, including RDDs (the core but older API), accumulators, and broadcast variables
* [Spark SQL, Datasets, and DataFrames](https://spark.apache.org/docs/latest/sql-programming-guide.html): processing structured data with relational queries (newer API than RDDs)
* [Structured Streaming](https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html): processing structured data streams with relational queries (using Datasets and DataFrames; newer API than DStreams)
* [MLlib](https://spark.apache.org/docs/latest/ml-guide.html): applying machine learning algorithms
* [GraphX](https://spark.apache.org/docs/latest/graphx-programming-guide.html): processing graphs
* [PySpark](https://spark.apache.org/docs/latest/api/python/getting_started/index.html): processing data with Spark in Python

**API Docs:**

* [Spark Scala API (Scaladoc)](https://spark.apache.org/docs/latest/api/scala/org/apache/spark/index.html)
* [Spark Java API (Javadoc)](https://spark.apache.org/docs/latest/api/java/index.html)

**Operations Guide:**

* [Configuration](https://spark.apache.org/docs/latest/configuration.html): customize Spark via its configuration system
* [Monitoring](https://spark.apache.org/docs/latest/monitoring.html): track the behavior of your applications
* [Tuning Guide](https://spark.apache.org/docs/latest/tuning.html): best practices to optimize performance and memory use
* [Job Scheduling](https://spark.apache.org/docs/latest/job-scheduling.html): scheduling resources across and within Spark applications

