Setting up IDEA for Scala and Spark development (complete guide)

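The earlier posts installed Scala 2.11.11, Spark spark-2.4.3-bin-hadoop2.7, and Hadoop 2.7.7 locally on Windows. If you have not installed them yet, do that first; see the earlier posts: installing Scala locally on Windows; installing Spark locally on Windows; installing Hadoop locally on Windows.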
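As a quick sanity check of the prerequisites, the sketch below reports which installation variables are visible to the JVM. The variable names JAVA_HOME, SCALA_HOME, SPARK_HOME, and HADOOP_HOME are assumptions about how the local installs were configured, and EnvCheck is an illustrative name; adjust both to your setup.

```scala
// Sketch: list which of the expected installation variables are not set.
// The variable names are assumptions, not taken from the original posts.
object EnvCheck {
  val expected: Seq[String] = Seq("JAVA_HOME", "SCALA_HOME", "SPARK_HOME", "HADOOP_HOME")

  // Returns the names from `names` that are absent or blank in `env`.
  def missing(env: Map[String, String], names: Seq[String] = expected): Seq[String] =
    names.filter(n => env.get(n).forall(_.trim.isEmpty))

  def main(args: Array[String]): Unit = {
    val absent = missing(sys.env)
    if (absent.isEmpty) println("All expected variables are set.")
    else println(s"Not set: ${absent.mkString(", ")}")
  }
}
```

Running it from a fresh terminal (or a fresh IDEA session) shows whether the variables were picked up after installation.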

1. Install the Scala plugin in IDEA

Follow the arrows in the screenshots: open Settings -> Plugins, search for Scala, and install the plugin.

Restart IDEA once the plugin is installed.

2. Add Scala framework support

Create the project: File -> New -> Project -> set the name and location, choose Java and Maven -> Create

Add Scala framework support: right-click the project -> Add Framework Support -> scroll down to Scala, select it, and click OK

3. Create a Scala example and run it as a test

Create a scala folder under both the main and test folders.

Mark the scala directory under main as a Sources Root.

Create a new Scala class and write a small example to test the setup.
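Any small program will do for this test. A minimal sketch follows; the object name HelloScala and the greeting method are illustrative and not from the original project.

```scala
// Minimal object to confirm that the Scala SDK and the source root are set up.
object HelloScala {
  // Built separately so the message can be checked without running main.
  def greeting(name: String): String = s"Hello, $name!"

  def main(args: Array[String]): Unit = {
    // Prints the version of the Scala library that is actually on the classpath.
    println(s"Scala version: ${scala.util.Properties.versionNumberString}")
    println(greeting("Scala"))
  }
}
```

If the file compiles and the Run action appears next to main, the framework support and source root are configured correctly.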

4. Add the Spark dependency jars and run a Spark example

Add the dependency jars: File -> Project Structure ->

Locate the jars folder inside your local Spark installation directory and click OK to add it.

Afterwards a jars entry appears here; these are the libraries the program needs at run time.
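One way to confirm that the jars were picked up is to try loading a Spark class by name. The sketch below does that without compiling against Spark; ClasspathCheck and isOnClasspath are illustrative names.

```scala
import scala.util.Try

// Sketch: check whether a class is visible on the current classpath.
object ClasspathCheck {
  // Loads the class without initializing it; true if the lookup succeeds.
  def isOnClasspath(className: String): Boolean =
    Try(Class.forName(className, false, getClass.getClassLoader)).isSuccess

  def main(args: Array[String]): Unit = {
    val sparkClass = "org.apache.spark.SparkContext"
    if (isOnClasspath(sparkClass)) println(s"$sparkClass found: the Spark jars are visible.")
    else println(s"$sparkClass not found: check the library entry in Project Structure.")
  }
}
```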

Create test2 with a sample Spark program and run it:

Code:

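import org.apache.spark.{SparkConf, SparkContext}

object test2 {
  def main(args: Array[String]): Unit = {
    // Run locally with 2 worker threads
    val conf = new SparkConf().setAppName("WordCount").setMaster("local[2]")
    val sc: SparkContext = new SparkContext(conf)

    // Read the input file as an RDD of lines
    val line = sc.textFile("F:\\test.txt")
    // Split each line into words
    val word = line.flatMap(_.split(" "))
    // Pair each word with a count of 1
    val tup = word.map((_, 1))
    // Sum the counts per word
    val reduced = tup.reduceByKey(_ + _)
    // Sort by count, descending
    val res = reduced.sortBy(_._2, false)

    println(res.collect.toBuffer)
    // Write the result to the ./TestWord directory
    res.saveAsTextFile("./TestWord")
    sc.stop()
  }
}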
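Note that saveAsTextFile refuses to write into an output directory that already exists, so running test2 a second time fails unless ./TestWord is removed first. A minimal sketch of a cleanup helper follows; OutputDir and deleteRecursively are illustrative names, and calling it before the job is a suggestion rather than part of the original program.

```scala
import java.io.File

// Sketch: remove a previous output directory before re-running the job.
object OutputDir {
  // Deletes `f` and everything below it; returns true if `f` is gone afterwards.
  def deleteRecursively(f: File): Boolean = {
    if (f.isDirectory) {
      // listFiles can return null on an I/O error, so fall back to an empty array.
      val children = Option(f.listFiles).getOrElse(Array.empty[File])
      children.foreach(deleteRecursively)
    }
    f.delete()
    !f.exists()
  }

  def main(args: Array[String]): Unit = {
    val removed = deleteRecursively(new File("./TestWord"))
    println(s"Output directory cleared: $removed")
  }
}
```

Calling OutputDir.deleteRecursively(new File("./TestWord")) at the top of test2's main would make the example safe to re-run.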

Contents of test.txt:
hello hello world scala java Python
java hello c++ c kafka flume hadoop sqoop
supervisor redis hive hive hbase hbase zookeeper hive hdfs hdfs hdfs
大数据大数据大数据程序员
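To see what the job should produce for this input without starting Spark, the same steps can be mirrored with plain Scala collections. This is a sketch for checking expectations, not a replacement for the Spark program; LocalWordCount is an illustrative name, and groupBy stands in for reduceByKey.

```scala
// Sketch: the word-count steps of test2, using ordinary Scala collections.
object LocalWordCount {
  // Split on spaces, count occurrences per word, sort by count descending.
  def count(lines: Seq[String]): Seq[(String, Int)] =
    lines
      .flatMap(_.split(" "))
      .groupBy(identity)
      .map { case (word, occurrences) => (word, occurrences.size) }
      .toSeq
      .sortBy { case (_, n) => -n }

  def main(args: Array[String]): Unit = {
    val lines = Seq(
      "hello hello world scala java Python",
      "java hello c++ c kafka flume hadoop sqoop",
      "supervisor redis hive hive hbase hbase zookeeper hive hdfs hdfs hdfs",
      "大数据大数据大数据程序员"
    )
    count(lines).foreach(println)
  }
}
```

For the sample file, hello, hive, and hdfs each occur 3 times and java and hbase twice; the last line contains no spaces, so it is counted as a single word. The order among words with equal counts is not fixed.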

Run output:

At this point the environment is ready.

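In addition, if you need to configure pom.xml, the following file can serve as a reference; just change the version numbers to match your installation: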

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <!-- Set your own groupId -->
    <groupId>org.example</groupId>
    <artifactId>sparkDemo</artifactId>
    <version>1.0-SNAPSHOT</version>

    <!-- Dependency version numbers -->
    <properties>
        <scala.version>2.11.12</scala.version>
        <hadoop.version>2.7.3</hadoop.version>
        <spark.version>2.4.0</spark.version>
    </properties>

    <dependencies>
        <!-- Scala -->
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>${scala.version}</version>
        </dependency>

        <!-- Spark -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>5.1.47</version>
        </dependency>

        <!-- Hadoop -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>${hadoop.version}</version>
        </dependency>

        <!--
        https://mvnrepository.com/artifact/com.google.code.gson/gson
        <dependency>
            <groupId>com.google.code.gson</groupId>
            <artifactId>gson</artifactId>
            <version>2.8.0</version>
        </dependency>
        https://mvnrepository.com/artifact/org.apache.kafka/kafka
        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka_2.11</artifactId>
            <version>1.0.0</version>
        </dependency>
        -->

        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-mllib -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-mllib_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>
    </dependencies>

    <build>
        <sourceDirectory>src/main/scala</sourceDirectory>
        <testSourceDirectory>src/test/scala</testSourceDirectory>
        <plugins>
            <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>3.2.2</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                        <configuration>
                            <args>
                                <arg>-dependencyfile</arg>
                                <arg>${project.build.directory}/.scala_dependencies</arg>
                            </args>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>2.4.3</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <filters>
                                <filter>
                                    <artifact>*:*</artifact>
                                    <excludes>
                                        <exclude>META-INF/*.SF</exclude>
                                        <exclude>META-INF/*.DSA</exclude>
                                        <exclude>META-INF/*.RSA</exclude>
                                    </excludes>
                                </filter>
                            </filters>
                            <transformers>
                                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                </transformer>
                            </transformers>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>
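When changing the versions, keep in mind that the _2.11 suffix on the Spark artifact IDs is the Scala binary version (major.minor) and should match scala.version. The sketch below shows how the suffix relates to the full version; ArtifactNames and its methods are hypothetical helpers for illustration, not part of Maven or Spark.

```scala
// Sketch: derive the Spark artifact ID suffix from a full Scala version.
object ArtifactNames {
  // "2.11.12" -> "2.11": keeps only the major and minor components.
  def binaryVersion(scalaVersion: String): String =
    scalaVersion.split('.').take(2).mkString(".")

  // Builds an artifact ID such as "spark-core_2.11" for the given module.
  def sparkArtifact(module: String, scalaVersion: String): String =
    s"spark-${module}_${binaryVersion(scalaVersion)}"

  def main(args: Array[String]): Unit = {
    Seq("core", "sql", "mllib").foreach(m => println(sparkArtifact(m, "2.11.12")))
  }
}
```

Both 2.11.11 (the locally installed version) and 2.11.12 (the version in the POM) map to the same _2.11 artifacts, so either patch version works with the dependencies listed above.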