DATA 228
Big Data Technologies and Applications (Fall 2024)
Sangjin Lee
Just Enough Java(TM)
Why Java?
• Java is the best way to learn Hadoop and MapReduce
• You will understand Hadoop fundamentals better with Java
• Java is (still) one of the most popular server-side languages
What we will cover
• Assuming little prior knowledge
• Just enough to get you going and start writing MapReduce applications
• Familiarity with fundamentals of the Java programming language
• Familiarity with the basic toolset
• Pretty linear programming in Java
• Stop me and ask questions!
What we will not cover
• More advanced language features
• Multi-threaded programming
• closures
• reactive programming
• modules
• etc. etc.
• Use of IDEs
Java programming language
• Java != JavaScript
• Java comes from the lineage of C/C++
• Created by James Gosling at Sun Microsystems; first released in 1995
• Currently #2 in language popularity (https://pypl.github.io/PYPL.html)
Java programming language
• Basic types: int, boolean, long, char, double, float, byte, short
• String is a first-class citizen
• Java uses curly braces (“{” and “}”) to delineate blocks and semicolons (“;”) to end statements
• Java comments
• Double slashes (“//”)
• Enclosing slash-stars (“/*” and “*/”)
• Special Javadoc comments (“/**” and “*/”)
Java programming language
• Variable declaration
• int someNumber = 1;
• Variable declarations must declare types; the only type inference is var for local variables (Java 10+)
• Method signature
• private boolean isOK(String input) {
• Access qualification via public, private, protected (and default)
• Use import to bring in classes, etc. from other code
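The pieces above (comments, declarations, a method signature, access modifiers, import) can be seen together in one small sketch. The class name BasicsDemo and the method keepOK are made up for illustration; only isOK and someNumber come from the slide.

```java
import java.util.ArrayList; // import brings classes from other packages into scope
import java.util.List;

/** Javadoc comment: describes the (hypothetical) class below. */
class BasicsDemo {
    private int someNumber = 1;   // private: visible only inside this class
    long counter = 0L;            // no modifier: default (package) access
    public String label = "demo"; // public: visible everywhere

    /* Method signature: access modifier, return type, name, parameters. */
    private boolean isOK(String input) {
        return input != null && !input.isEmpty();
    }

    // Public method that relies on the private one (hypothetical helper).
    public List<String> keepOK(String[] inputs) {
        List<String> result = new ArrayList<String>();
        for (String input : inputs) {
            if (isOK(input)) {
                result.add(input);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        BasicsDemo demo = new BasicsDemo();
        System.out.println(demo.keepOK(new String[] {"a", "", "b"})); // prints [a, b]
    }
}
```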
Java programming language
Arrays and Collections
• Array: a fixed-size, indexed sequence of things
• int[] numbers = {1, 2, 3, 4, 5};
• numbers[2] = ?
• Collections: various collection types of things
• List: List<Integer> numbers = new LinkedList<Integer>();
• Map: Map<String,String> dictionary = new HashMap<String,String>();
• Set: Set<Integer> numbers = new HashSet<Integer>();
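A minimal sketch of these declarations in use. The class and method names are made up for illustration; the point is that array indexes start at 0, a List keeps duplicates, and a Set does not.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class; all names here are for illustration only.
class CollectionsDemo {
    // Returns the element at index 2 of a five-element array.
    static int thirdElement() {
        int[] numbers = {1, 2, 3, 4, 5};
        return numbers[2]; // indexes start at 0, so this is the third element
    }

    // A List keeps insertion order and allows duplicates.
    static List<Integer> buildList() {
        List<Integer> numbers = new LinkedList<Integer>();
        numbers.add(10);
        numbers.add(20);
        numbers.add(10);
        return numbers;
    }

    // A Map stores key-value pairs.
    static Map<String, String> buildDictionary() {
        Map<String, String> dictionary = new HashMap<String, String>();
        dictionary.put("hadoop", "a framework for distributed processing");
        dictionary.put("java", "a programming language");
        return dictionary;
    }

    // A Set stores each distinct value only once.
    static Set<Integer> buildSet() {
        Set<Integer> numbers = new HashSet<Integer>();
        numbers.add(10);
        numbers.add(20);
        numbers.add(10); // duplicate, ignored
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println("numbers[2] = " + thirdElement());          // 3
        System.out.println("list size = " + buildList().size());       // 3
        System.out.println("java -> " + buildDictionary().get("java"));
        System.out.println("set size = " + buildSet().size());         // 2
    }
}
```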
Java programming language
Generics
• Generics: add type information to methods and classes
public interface Comparator<T> {
    ....
    int compare(T o1, T o2);
    ....
}
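As a sketch of how the type parameter T gets filled in: the hypothetical LengthComparator below implements Comparator<String>, so its compare method takes Strings. The class names and the sort-by-length rule are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical comparator: orders strings by length, shortest first.
class LengthComparator implements Comparator<String> {
    @Override
    public int compare(String o1, String o2) { // T is String here
        return Integer.compare(o1.length(), o2.length());
    }
}

class GenericsDemo {
    // Returns a sorted copy; the input list is left untouched.
    static List<String> sortByLength(List<String> words) {
        List<String> copy = new ArrayList<String>(words);
        Collections.sort(copy, new LengthComparator());
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("mapreduce", "java", "hadoop");
        System.out.println(sortByLength(words)); // prints [java, hadoop, mapreduce]
    }
}
```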
Java programming language
• Java is an object-oriented programming (OOP) language
• Everything is a class
• There are no global variables
• Inheritance and polymorphism
• Interfaces
• Java is a compiled language (vs. interpreted)*
* The distinction is not as clear as one may think.
Java programming language
Class
• A template for object instances
• Member variables
• Instance member variables
• Static member variables & (static) constants
• Methods
• Instance methods
• Constructors
• Abstract methods
• Static methods
Java programming language
Class
public class Student {                       // class
    private static int studentCount = 0;     // static member
    private long id;                         /* instance member */
    private String name;                     // instance member

    public Student(String name, long id) {   // constructor
        this.name = name;
        this.id = id;
        studentCount++;                      // increment when we create a new student
    }

    public String getName() {                // instance method
        return name;
    }

    // abstract method (signature only, no body); it compiles only if the
    // class itself is declared abstract, and such a class cannot be
    // created directly with new:
    // public abstract String getDorm();
}
Java programming language
Creating an instance of a class
• Call one of its constructors
• Student someStudent = new Student("Jimmy Dean", 5983);
• Default constructor
• Constructor with no arguments
• All instance members are initialized to their default values (0, null, …)
• Foo foo = new Foo();
• Implicitly created if there are no other constructors
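A small sketch of the implicit default constructor. Foo declares no constructor at all, so Java supplies one; the field names are made up for illustration.

```java
// Hypothetical class with no constructor declared: Java supplies a
// no-argument default constructor automatically.
class Foo {
    int count;     // defaults to 0
    String label;  // defaults to null
    boolean ready; // defaults to false
}

class DefaultConstructorDemo {
    public static void main(String[] args) {
        Foo foo = new Foo(); // calls the implicit default constructor
        System.out.println(foo.count + " " + foo.label + " " + foo.ready); // prints 0 null false
    }
}
```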
Java programming language
Referring to member variables and methods
• Instance member variables and methods (someStudent as an instance of the Student class)
• someStudent.id
• someStudent.getName()
• Static member variables and methods (Student class): kind of like globals
• Student.studentCount
• Student.someStaticMethod()
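A sketch contrasting the two kinds of access, using a simplified variant of the Student class. Assumptions: id and studentCount are given default (package) access here so that code outside the class can read them, and getStudentCount() is a made-up static method.

```java
// Hypothetical, simplified variant of the Student class.
class Student {
    static int studentCount = 0; // one copy shared by the whole class
    long id;                     // one copy per instance
    private String name;

    Student(String name, long id) {
        this.name = name;
        this.id = id;
        studentCount++;
    }

    String getName() {
        return name;
    }

    // Static method: called on the class, not on an instance.
    static int getStudentCount() {
        return studentCount;
    }
}

class MemberAccessDemo {
    public static void main(String[] args) {
        Student a = new Student("Jimmy Dean", 5983);
        Student b = new Student("Ada Lovelace", 1815);
        System.out.println(a.getName() + " " + a.id);  // instance members, via a reference
        System.out.println(b.getName() + " " + b.id);
        System.out.println(Student.getStudentCount()); // static member, via the class name
    }
}
```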
Java programming language
• Java handles errors via Exceptions
• try-catch idiom
Java programming language
Exceptions
public class Callee {
    ....
    public void foo() throws IOException {
        ....
    }
    ....
}

public class Caller {
    private boolean happy() {
        Callee c = ....;
        try {
            c.foo();
        } catch (IOException e) {
            // handle this exception
        }
        return something;
    }
}
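A runnable variant of the same idiom, as a sketch. The fail parameter and the boolean return convention are assumptions added so that both the normal path and the exception path can be exercised.

```java
import java.io.IOException;

// Hypothetical callee: fails with an IOException when asked to.
class Callee {
    void foo(boolean fail) throws IOException {
        if (fail) {
            throw new IOException("simulated I/O failure");
        }
    }
}

class Caller {
    // Returns true if foo() completed normally, false if it threw.
    static boolean happy(boolean fail) {
        Callee c = new Callee();
        try {
            c.foo(fail);
            return true;
        } catch (IOException e) {
            // handle this exception: here we just report it
            System.out.println("caught: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(happy(false)); // prints true
        System.out.println(happy(true));  // prints "caught: simulated I/O failure", then false
    }
}
```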
Java programming language
• Strings
• Unicode
• Immutable
• Iteration
• for-each
• while
• do-while
• Supporting iteration: arrays, most collections, and Iterator
Java programming language
Iteration via Iterator
public class Foo {
    ....
    public void foo(Iterator<String> it) throws IOException {
        while (it.hasNext()) {
            String s = it.next(); // advances the cursor to the next element in iteration
            System.out.println("string: " + s);
        }
    }
    ....
}
Java programming language
Iteration via Iterator
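public class Foo {
    ....
    // for-each needs an array or an Iterable (such as a List), not a bare
    // Iterator; behind the scenes it calls items.iterator()
    public void foo(Iterable<String> items) throws IOException {
        for (String s : items) {
            System.out.println("string: " + s);
        }
    }
    ....
}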
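The two styles side by side, as a sketch. The class and method names are made up; both methods visit the same elements in the same order.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class; names are for illustration only.
class IterationDemo {
    // Explicit Iterator: ask hasNext(), then fetch with next().
    static String joinWithIterator(Iterator<String> it) {
        StringBuilder sb = new StringBuilder();
        while (it.hasNext()) {
            sb.append(it.next()); // next() returns the element and advances
        }
        return sb.toString();
    }

    // for-each: works with arrays and any Iterable (e.g. a List).
    static String joinWithForEach(Iterable<String> items) {
        StringBuilder sb = new StringBuilder();
        for (String s : items) {
            sb.append(s);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "c");
        System.out.println(joinWithIterator(words.iterator())); // prints abc
        System.out.println(joinWithForEach(words));             // prints abc
    }
}
```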
Java programming language
• Inheritance is done via extends and implements
• Base class -> extends
• Interface -> implements
• Contains nothing but method signatures*
• Denotes type: “A Foo is any class that implements all these methods”
• A class that implements an interface must provide all method implementations
* Not always true, as Java 8 introduced default methods for interfaces.
Java programming language
Inheriting classes
public class BaseClass {
    ....
    public void foo() throws IOException {
        ....
    }

    public int bar() {
        ....
    }
}

public class ChildClass extends BaseClass {
    // you may choose to "override" base class methods
    public int bar() {
        ....
    }
}
Java programming language
Interfaces
public interface Runnable {    // declared with interface, not class
    public void run();         // (usually) no implementation
}

public class MyRunnable implements Runnable {    // you MUST declare implements
    // you MUST implement all interface methods
    public void run() {
        ....    // my implementation of run()
    }
}
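A sketch that puts extends, implements, and overriding together. Greeter, PoliteGreeter, and LoudGreeter are made-up types; the loop shows polymorphism, since the same call runs different code depending on the actual class.

```java
// Hypothetical types for illustration.
interface Greeter {
    String greet(String name);
}

class PoliteGreeter implements Greeter {
    @Override
    public String greet(String name) {
        return "Hello, " + name;
    }
}

// Inherits from PoliteGreeter and overrides greet().
class LoudGreeter extends PoliteGreeter {
    @Override
    public String greet(String name) {
        return super.greet(name).toUpperCase() + "!";
    }
}

class InheritanceDemo {
    public static void main(String[] args) {
        Greeter[] greeters = { new PoliteGreeter(), new LoudGreeter() };
        for (Greeter g : greeters) {
            System.out.println(g.greet("Hadoop")); // Hello, Hadoop  then  HELLO, HADOOP!
        }
    }
}
```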
Java programming language
• Java has a garbage collector
• No need to do memory management: no pointers (C/C++), no free/delete (C/C++), no memory ownership (Rust)
• Garbage collector: JVM runtime component that observes object usage and removes unused objects from memory automatically
• Implications of having a garbage collector
• Ease of programming
• Unpredictability: you lose precise control over the program
Java programming language
Compile-time vs. runtime
• At compile-time, code is compiled into bytecode
• Bytecode is a machine-independent “executable” that gets generated by the compiler
• .class files or jar files (collections of .class files)
• “Write once, run anywhere”
• At runtime, bytecode is read and compiled into actual binaries by the Java Virtual Machine (JVM)
• These binaries are not visible to you
• JVM continues to regenerate and optimize the binaries (just-in-time compiler)
Java programming language
Compile-time vs. runtime
[Figure: Compile-time: code (.java) -> compiler (javac) -> bytecode (.class or jar). Runtime: bytecode -> JIT compiler inside the JVM -> native binary, running on the OS/hardware.]
Java programming language
Dependencies and classpath
• How do you use other people’s code (libraries)? -> dependencies
• Classpath: specifies the paths where others’ libraries can be found
• Classpath matters for compile-time and runtime
• Java comes with a ton of built-in libraries (APIs): https://docs.oracle.com/en/java/javase/21/docs/api/index.html
• For this class, Java APIs and Hadoop APIs should be sufficient
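To see what the classpath looks like at runtime, a program can read the standard java.class.path system property. This is only a sketch: the class name and the entries helper are made up, and the output depends on how the program is launched.

```java
import java.io.File;

// Hypothetical demo class: prints each classpath entry on its own line.
class ClasspathDemo {
    // Splits a classpath string using the platform separator
    // (":" on Linux/macOS, ";" on Windows).
    static String[] entries(String classpath) {
        return classpath.split(File.pathSeparator);
    }

    public static void main(String[] args) {
        String classpath = System.getProperty("java.class.path");
        for (String entry : entries(classpath)) {
            System.out.println(entry);
        }
    }
}
```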
Java programming language
Toolset
• JDK vs. JRE
• JDK (Java Development Kit)
• Includes everything for developing, building, and running Java programs
• Superset of JRE
• JRE (Java Runtime Environment)
• Only includes things to start JVMs and run Java programs
• Recommend OpenJDK (https://openjdk.org/)
Java programming language
Toolset
• Build tools
• Choices: Maven, Gradle, Bazel, …
• We’re going to use Maven for this class
• Hadoop uses Maven to build and specify dependencies
• Most Java libraries in the world are discoverable and available through the Maven ecosystem
Java programming language
Demo
Work towards a “Hello World” Java/Maven app
Java programming language
Demo
• (Install JDK and Maven)
• Create a simple Maven project
• Explore the Maven project
• Write a main app that prints the current locale
• Build a jar
• Run the program in the jar
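A minimal sketch of what the main app in this demo might look like. The class name HelloWorld and the message wording are assumptions, not the code used in class; Locale.getDefault() is the standard way to read the JVM's current locale.

```java
import java.util.Locale;

// Hypothetical main class for the demo; the name is a placeholder.
class HelloWorld {
    // Builds the greeting for a given locale (kept separate so it is easy to test).
    static String message(Locale locale) {
        return "Hello World! Current locale: " + locale;
    }

    public static void main(String[] args) {
        // Output depends on the machine's settings, e.g. en_US.
        System.out.println(message(Locale.getDefault()));
    }
}
```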
Java programming language
Demo
Maven “coordinates”
edu.sjsu.data228.sjlee:helloworld-1.0-SNAPSHOT.jar
groupId: edu.sjsu.data228.sjlee
artifactId: helloworld
(semantic) version: 1.0-SNAPSHOT