Analysis of Software Engineering Industry Needs An



Analysis of software engineering industry needs and trends: Implications for education

Article  in  International Journal of Engineering Education · January 2017


2 authors:

Fatih Gurcan, Karadeniz Technical University
Cemal Köse, Karadeniz Technical University

All content following this page was uploaded by Fatih Gurcan on 31 January 2019.



International Journal of Engineering Education Vol. 33, No. 4, pp. 1361–1368, 2017. 0949-149X/91 $3.00+0.00
Printed in Great Britain. © 2017 TEMPUS Publications.

Analysis of Software Engineering Industry Needs and Trends: Implications for Education*
FATIH GURCAN and CEMAL KOSE
Karadeniz Technical University, Department of Computer Engineering, Faculty of Engineering, 61080, Trabzon, Turkey.
E-mail: fgurcan@ktu.edu.tr, ckose@ktu.edu.tr

In modern day software development environments, analysis and understanding of the emerging industry needs is of
strategic importance for a more effective software engineering (SE) education that is innovative and responsive to changing
industry needs. Considering the demand for well-trained software engineers in the near future, an empirical study was
performed on SE job postings in order to identify the emerging needs and trends in the software industry. The methodology
of this study was based on semantic topic analysis implemented by latent Dirichlet allocation (LDA), a probabilistic
generative approach for topic modeling. The findings of the study indicated that the software industry has a wide spectrum
in terms of professional roles, responsibilities (in-demand skills) and combinations of programming languages. Each of the
professional roles is profoundly based on specific skill sets that reflect the dynamics of the software industry. Also, the
topics discovered by LDA highlighted a broad range of the characteristics of the SE, such as contemporary trends,
demands, skills, tools, platforms, methodologies, and technologies that indicate the level of progress in this dynamic field.
In light of these findings, an innovative academic curriculum for SE education can be designed consistent with the emerging
needs and trends in the software industry. In this regard, the findings can provide valuable implications for the industry,
academia, and SE community to close the gap between the industry needs and the current SE education.
Keywords: software engineering education; software engineer skills; software industry needs; topic modeling; latent Dirichlet allocation

1. Introduction

The 21st century has seen significant progress in information technology (IT) and is therefore often described as the information age. Consistent with these developments in IT, software development technologies have undergone a major evolution. Today, a wide spectrum of software applications is used effectively in every phase of daily life. From this perspective, software engineering (SE) plays a crucial role as an engineering discipline based on the application of engineering practices to the software development process [1–3]. In the new world order led by IT, understanding the emerging needs and trends in the dynamic software industry is a strategically key factor for the SE discipline in order to keep pace with industrial modernization [3, 4]. The software industry has dynamic, entrepreneurial, and collaborative working environments in which all processes are based on a cognitive labor force, and so human resources are used effectively [2, 3]. In these working environments, as the leading actors, software engineers are expected to have a wide spectrum of frequently changing roles, responsibilities, and skills. In addition, the contemporary software development process requires the use of different combinations of programming languages [5]. For this reason, software engineers should always keep their knowledge and skills up to date [6]. As the software industry advances, new career opportunities are opening up every day for software engineers. Online employment platforms (web sites) are used intensively by employees and employers to interact with each other [5]. The volume and variety of the information shared on these platforms have been increasing steadily as a result of this intensive usage.

Numerous SE jobs are published every day on these platforms. From this perspective, SE job postings can be seen as an indicator of the industry needs and trends in this field [5–7]. Therefore, studies that analyze SE jobs and determine the needs and trends may provide valuable contributions for the engineers, instructors, and companies in the SE field.

Given this background, numerous studies have determined the professional qualifications required of the IT workforce by analyzing online job postings [5–8]. Studies based on the analysis of job postings can reveal the up-to-date industry needs and trends related to the SE field. Accordingly, educating software engineers in line with industrial needs and trends is seen as an essential open question for the future of the SE discipline [2]. In this regard, the gap between software industry needs and academic preparation was discussed in a number of studies [2–9], and various approaches were proposed in order to close this gap by developing new learning models [10] and methodologies [11], investigating the industry needs and trends

* Accepted 10 February 2017.



[12, 13], analyzing educational requirements for software engineers [14], determining professional roles and responsibilities (in-demand skills) for software engineers [5–8], and leveraging SE practices [1]. In most of the above-mentioned studies, traditional content analysis techniques (without semantic analysis or probabilistic topic modeling approaches) were used to discover industry needs and trends. In this dynamic framework, more research is needed to analyze and interpret these emerging needs and trends [2]. In particular, supplementary studies based on generative models, semantic analysis, and topic modeling can contribute to SE research and practice in a well-founded manner.

In this study, we investigated the emerging roles, responsibilities, trends, and demands for software engineers by analyzing the texts of SE job postings. The study was built on five focal points: (1) identification of the professional roles and responsibilities of software engineers, (2) determination of the most in-demand combinations of the programming languages used in today’s software development environments, (3) identification of the educational requirements for software engineers, (4) detection of trending topics at a high level of granularity in SE jobs, and finally, in light of these findings, (5) provision of contributions and insights for the design of an innovative SE curriculum consistent with the emerging trends and demands in the software industry. Based on this purpose and scope, an empirical topic analysis was carried out on SE jobs using a generative topic-modeling approach called latent Dirichlet allocation (LDA) [15]. In this analysis, 30 latent topics were discovered at the optimal setting, and these topics enabled us to carry out qualitative and quantitative evaluations of SE trends. The findings of the study demonstrated that today’s software engineers are expected to undertake a wide spectrum of roles and responsibilities. From this point of view, software engineers are characterized by their roles and by different combinations of responsibilities (in-demand skills). The topics discovered by LDA highlighted a broad range of the characteristics of SE, such as contemporary trends, demands, skills, tools, platforms, and technologies that reflect the level of progress in this dynamic field.

Our LDA-based topic analysis can contribute to a better understanding of the changing nature of SE trends. The findings of this study can be helpful for (1) software engineers to evaluate and update their individual capabilities, (2) software corporations to select and employ qualified software engineers, (3) educational institutions to design SE programs and core curricula consistent with emerging needs and trends, and (4) students interested in SE to plan their future careers. The rest of the paper is organized as follows. The research methodology and research data are described in Section 2. The results are presented in Section 3. The findings are discussed in Section 4. Finally, the conclusions, limitations, and future work are given in Section 5.

2. Research methodology

The research methodology of this study was based on semantic topic analysis of the SE job postings using LDA-based topic modeling, a quantitative approach to analyzing qualitative data [15, 16]. The methodology was designed according to the focal points of the study and consisted of a number of sequential phases, as shown in Fig. 1. Initially, the job postings were collected and the dataset was created. Next, data preprocessing steps were applied to reduce dimensionality and to improve the quality of the analysis. In order to perform numerical analysis on the dataset, the document-term matrix (DTM) was created. After this process, the semantic analysis was performed by applying LDA-based topic modeling to the DTM to discover latent topics. Finally, the results of the analysis were presented and the empirical findings were discussed in light of related studies. In the following subsections, each phase of the methodology is described in more detail.

Fig. 1. An overview of the research methodology.

2.1 Data collection and preprocessing
Although there are numerous tech-focused job boards, considering the purpose of the study,
Stack Overflow Careers [17] was selected as the data source. Two main criteria were taken into account in this selection. The first criterion was that the board should be related only to the SE field. The second criterion was that the board should include jobs from different countries. In addition, Stack Overflow [18] is a popular question-and-answer platform used intensively by software engineers [13]. For this reason, the job postings on this board are widely followed and discussed by engineers. Our dataset consisted of 2533 unique SE job postings covering a six-month period, from January 2016 to June 2016. In the dataset, a typical job posting contained information such as roles, responsibilities, location, major skills, requirements, and job description.

After the data collection phase, data preprocessing was performed on the textual dataset. Data preprocessing is an essential process used to increase the quality of the analysis in text mining and information retrieval [19]. The preprocessing performed in this study consisted of four steps applied consecutively: tokenization, lowercase conversion, removal of special characters, and removal of stop words (is, a, the, for, etc.). No stemming was applied, in order to avoid any loss of meaning, because the text of the empirical data contained numerous technical terms. In the last step prior to the topic analysis, the textual data was converted into a DTM in order to perform numerical topic analysis on the data. The DTM is a two-dimensional matrix that describes the frequency of the words that occur in a collection of documents. In the DTM, rows represent the documents in the collection and columns represent the words that occur in the documents. The DTM is commonly used as the primary input in the text mining process [20]. After the preprocessing steps described above, the dataset was represented by a lower-dimensional DTM.

2.2 Topic analysis and interpretation
Text documents contain latent semantic structures, which are called “topics”. In topic analysis, each text document is represented by a combination of topics, and each topic is represented by a probability distribution over frequently co-occurring words [15, 16]. In this study, latent Dirichlet allocation (LDA) [15], a generative topic modeling approach, was used to discover emerging needs and trends in the software industry. LDA-based topic modeling is used effectively for the semantic analysis of document collections in text mining. Learning in the LDA model is unsupervised, so millions of textual documents can be analyzed in a short time [16, 21]. For these reasons, LDA is the most suitable method for detecting trending topics in our empirical dataset.

In our experiments, we used the LDA implementation of MALLET [22], an open-source software package for statistical natural language processing and topic modeling. MALLET uses the Gibbs sampling algorithm [23] for parameter estimation. We ran MALLET with 1500 iterations of Gibbs sampling for each experiment. The number of topics was varied between 15 and 50 to find an optimal setting [21]. The desired inferences were achieved when the number of topics was set to 30.

Bigram Topic Model: Essentially, LDA relies on the “bag of words” assumption, which does not take the order of words into account [16]. However, the order and proximity of words are significant for specific semantic analyses. In this sense, the bigram topic model can be considered an extension of LDA that incorporates the order of words [24]. The bigram topic model is generally used to uncover semantic relations between words. For example, the words “web developer” can be processed as a bigram or as two individual unigrams, “web” and “developer”. The meaning of “web” is different from the meaning of “web developer”. In this model, each word is evaluated together with the characteristics of the previous word, which increases the semantic accuracy of the topic modeling [24]. In this study, trigram word distributions were used in the same manner as the bigram topic model in order to identify the triple combinations of programming languages. The bigram model can be applied to the data using the “--keep-sequence-bigrams TRUE” parameter in MALLET.

3. Results

3.1 The roles and responsibilities of software engineers
As a result of the analysis, 20 main roles were identified for software engineers. The roles are sorted in descending order of frequency and presented in Table 1. According to the findings, the most in-demand role was “Software Engineer” (12.4%), followed by “Mobile Developer” (10.6%) and “Frontend Developer” (9.0%). Roles that appeared with low frequency (less than 1%) or were misspelled were assigned to their nearest group. We consider the findings presented in the table to be descriptive of the roles and responsibilities of software engineers, so no additional description of the role definitions is needed. The roles and responsibilities offer descriptive information about the areas of expertise in the SE field.
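The preprocessing and DTM construction described in Section 2.1 can be illustrated with a minimal Python sketch. This is not the authors' tooling: the two sample postings, the stop-word list, and the function names `preprocess` and `build_dtm` are assumptions made for illustration. The sketch tokenizes and lowercases the text, strips special characters while keeping technical tokens such as c++, c#, and .net, removes stop words, applies no stemming, and then counts term frequencies per document.

```python
import re
from collections import Counter

# Assumed, deliberately tiny stop-word list; the paper does not publish its list.
STOP_WORDS = {"is", "a", "the", "for", "and", "of", "to", "in", "with", "we", "are"}

# Keep letters, digits and the symbols that occur in technical terms (c++, c#, .net).
TOKEN_PATTERN = re.compile(r"[a-z0-9.+#]+")


def preprocess(text):
    """Tokenize, lowercase, drop special characters and stop words (no stemming)."""
    tokens = TOKEN_PATTERN.findall(text.lower())
    # Trim sentence punctuation that sticks to a token, e.g. "sql." -> "sql".
    tokens = [t.rstrip(".") for t in tokens]
    return [t for t in tokens if t and t not in STOP_WORDS]


def build_dtm(documents):
    """Return (vocabulary, matrix): rows are documents, columns are vocabulary terms."""
    tokenized = [preprocess(doc) for doc in documents]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    column = {term: j for j, term in enumerate(vocabulary)}
    matrix = [[0] * len(vocabulary) for _ in tokenized]
    for i, tokens in enumerate(tokenized):
        for term, count in Counter(tokens).items():
            matrix[i][column[term]] = count
    return vocabulary, matrix


# Hypothetical postings, for illustration only.
postings = [
    "We are looking for a Java developer with SQL and Java EE experience.",
    "Frontend developer: JavaScript, HTML5, CSS and C++ is a plus!",
]
vocabulary, dtm = build_dtm(postings)
```

Each row of `dtm` corresponds to one posting and each column to one term of `vocabulary`, so `dtm[0][vocabulary.index("java")]` is the number of times "java" occurs in the first posting. A production pipeline would normally use a sparse matrix, since most entries are zero.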

Table 1. Distribution of the roles and responsibilities

Role                        Responsibilities (in-demand skills)               Count  Rate %

Software Engineer           java, python, c#, javascript, c++                 315    12.4%
Mobile Developer            android, ios, mobile, unity, objective-c          267    10.6%
Frontend Developer          javascript, html5, css, java, angularjs           227    9.0%
Software Developer          java, c#, c++, javascript, python                 179    7.1%
Full Stack Engineer         java, python, javascript, c++, c#                 159    6.3%
Data Engineer               sql, mysql, oracle, hadoop, java                  148    5.8%
Backend Developer           php, java, mysql, python, nodejs                  143    5.6%
DevOps Engineer             linux, aws, devops, puppet, java                  128    5.1%
Java Developer              java, spring, sql, hadoop, javascript             113    4.4%
Cloud Systems Engineer      cloud, linux, java, c#, aws                       103    4.0%
Web Developer               html, javascript, css, angularjs, jquery          94     3.7%
Ruby on Rails Developer     ruby on rails, ruby, python, backbonejs, nodejs   91     3.6%
System Engineer             linux, windows, python, ruby, java                88     3.5%
UI/UX Developer             ui, javascript, ux, html, css                     83     3.3%
JavaScript Developer        javascript, html, css, angularjs, nodejs          81     3.2%
Python Developer            python, django, postgresql, php, jquery           76     3.0%
C++ Developer               c++, c#, .net, sql, c                             71     2.8%
Quality Assurance Engineer  qa, sql, testing, java, c++                       67     2.6%
Application Developer       java, c++, .net, ruby, linux                      53     2.1%
.Net Developer              .net, c#, java, asp.net, oop                      46     1.8%

Total                                                                         2533   100%
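
A tabulation in the style of Table 1 can be sketched in a few lines of Python. The sketch assumes that each posting has already been labeled with a role and a list of skill tags; the record layout, the sample records, and the function name `tabulate_roles` are hypothetical and are not taken from the study. The rate is the share of postings per role, which is consistent with the first row of the table (315 of 2533 postings is about 12.4%).

```python
from collections import Counter, defaultdict


def tabulate_roles(postings, top_n=5):
    """Return rows (role, top skills, count, rate %) sorted by descending count."""
    role_counts = Counter(p["role"] for p in postings)
    skills_by_role = defaultdict(Counter)
    for p in postings:
        # Count each skill once per posting, even if it is tagged twice.
        skills_by_role[p["role"]].update(set(p["skills"]))

    total = len(postings)
    rows = []
    # Sort by descending frequency; break ties alphabetically for stable output.
    for role, count in sorted(role_counts.items(), key=lambda kv: (-kv[1], kv[0])):
        ranked = sorted(skills_by_role[role].items(), key=lambda kv: (-kv[1], kv[0]))
        top_skills = [skill for skill, _ in ranked[:top_n]]
        rows.append((role, top_skills, count, round(100.0 * count / total, 1)))
    return rows


# Hypothetical labeled postings, for illustration only.
sample = [
    {"role": "Software Engineer", "skills": ["java", "python"]},
    {"role": "Software Engineer", "skills": ["java", "c#"]},
    {"role": "Mobile Developer", "skills": ["android", "ios"]},
    {"role": "Software Engineer", "skills": ["java", "python"]},
]
for role, skills, count, rate in tabulate_roles(sample):
    print(f"{role:<20} {', '.join(skills):<20} {count:>3} {rate:>5}%")
```

Merging rare or misspelled role names into their nearest group, as described in Section 3.1, would have to happen before this step; the paper does not specify how the nearest group was chosen, so that step is left out here.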

As for the responsibilities of software engineers, 476 different skills were identified in various areas of expertise. Based on the frequency of the in-demand skills, the top five most required skills were Java (21%), JavaScript (18%), Python (12%), Html (11%), and C++ (8%). The findings indicated that programming languages are the core competencies of the SE field, and the dominance of scripting languages is also remarkable. For clarity, the responsibilities were clustered according to the roles and presented in Table 1. The skills are sorted in descending order of frequency for each role. For example, the required skills for software engineers are listed as “java”, “python”, “c#”, “javascript”, and “c++”. This means that “java” is the most important skill, “python” the second most important, and “c#” the third most important for software engineers.

3.2 The most in-demand combinations of the programming languages

Based on the findings on the in-demand skills given in the previous section, the most in-demand triple combinations of programming languages were identified via the trigram topic model. These combinations are sorted in descending order of frequency and presented in Table 2. According to the findings, the most in-demand triple combination was “html, css, javascript” (12.4%), followed by “html, javascript, sql” (6.88%) and “.net, asp.net, c#” (6.61%). As the table shows, knowledge of only one programming language is not enough for software engineers. Today’s software development environments require knowledge of multiple programming languages together.

Table 2. The top 20 most in-demand triple combinations of programming languages

Combination                 Rate

html, css, javascript       12.4%
html, javascript, sql       6.88%
.net, asp.net, c#           6.61%
.net, asp.net, sql          6.50%
javascript, jquery, html    5.76%
java, javascript, sql       5.70%
java, javascript, html      5.68%
java, javascript, css       5.20%
html, java, css             4.86%
.net, asp.net, javascript   4.81%
python, java, c++           4.43%
java, html, sql             4.09%
java, javascript, jquery    3.67%
java, javascript, xml       3.64%
python, java, sql           3.59%
ruby, java, python          3.59%
javascript, java, spring    3.22%
c#, c++, java               3.19%
python, java, javascript    3.14%
.net, asp.net, jquery       3.07%

Total                       100%

3.3 The educational requirements for software engineers
In this phase, the dataset was analyzed to determine the educational requirements for software engineers. The results were associated with the roles identified previously. In this way, the frequency of educational requirements was calculated for each role and presented in Table 3. According to the findings, quality assurance engineer has the highest rate (76%), whereas ruby on rails developer has the lowest rate (21%) in
terms of educational requirements. In total, educational requirements were stated in 45% of all the SE jobs. This finding indicated a considerable gap (55%) between the software industry and academia. Educational requirements for software engineers typically included a bachelor’s or a master’s degree in computer science, information science, engineering, information systems, or other related majors. A degree in computer science or related fields was preferred in the majority of job postings. According to the topic analysis outcomes presented in the following section, the most common descriptive keywords related to this topic are as follows: computer, science, degree, related, engineering, field, ms, and bachelor.

Table 3. The educational requirements for the roles

Role                        Rate

Quality Assurance Engineer  76%
UI/UX Developer             64%
Software Engineer           63%
Cloud Systems Engineer      60%
Java Developer              59%
Data Engineer               52%
C++ Developer               51%
Mobile Developer            47%
DevOps Engineer             44%
Python Developer            41%
Software Developer          41%
System Engineer             40%
.Net Developer              39%
Backend Developer           36%
JavaScript Developer        34%
Application Developer       32%
Full Stack Engineer         31%
Frontend Developer          29%
Web Developer               27%
Ruby on Rails Developer     21%

Total                       45%

3.4 Trending topics in the SE field

Determining the trending topics in the SE field was an empirical process that required semantic topic analysis on different levels. To this end, we applied the LDA-based topic modeling approach to discover the trending topics in this field. As a result of the analysis, the 30 most trending topics were discovered. The discovered topics and related keywords are presented in Table 4.

The topic names were assigned manually to briefly describe each topic. The top eight descriptive keywords for each topic were determined and sorted in descending order of frequency. According to the topics, the areas of expertise in SE covered a wide variety of up-to-date skill sets. These included distributed systems, real-time processing, cloud-based development, open source development, and scripting languages. The discovered topics also highlighted many emerging trends, such as the programming languages,

Table 4. The top 30 topics discovered by LDA

Topic Name                    Rate   Top LDA words

Communication Skill           15.9%  skills communication written english ability verbal excellent oral
Educational Requirements      11.0%  computer science degree related engineering field ms bachelor
Database Technologies         8.1%   database sql git mysql nosql relational oracle sqlserver
Scripting Languages           4.6%   java python languages php ruby language programming scripting
JavaScript Frameworks         4.3%   javascript html css jquery frontend angularjs ajax bootstrap
Web Services                  3.8%   web service api rest json based backend xml restful technology
Cloud Management              3.6%   tool management cloud process automation aws build
Mobile Development            3.6%   android mobile app ios developer application platform objective
System Administration         3.3%   system linux required windows admin operating unix command
Software Testing              3.3%   testing tools integration test automation build unit tdd quality
Teamwork Skills               3.1%   work team ability problem solving strong motivated player sense
Object Oriented Programming   2.9%   experience object oriented programming java professional year role
Code Writing Quality          2.8%   code writing quality develop automated maintaining test clean
Multi-platform Development    2.7%   multi multiple platform ios web android mobile development
Open-source Development       2.3%   open source system online development software driven github
Networking Security           2.3%   problems solve security network system complex issues http
JavaScript Libraries          2.2%   javascript js nodejs modern libraries similar spring angularjs django
Quality Assurance             2.1%   technical quality engineer support QA training level optimize
Project Management            2.1%   project process strong managing github drive team good personal
Model-View-Controller         2.0%   technology working including mvc c# large scale .net c++
Business Solutions            1.8%   business solution product designing communicate internal customer
Configuration Management      1.8%   customer product puppet chef configuration support operations
Distributed Systems           1.7%   distributed scalable cloud open system high building performance
Big Data Processing           1.5%   big data learning algorithms processing record hadoop structures
Real-Time Processing          1.3%   real time process data project stream analytics make
Software Designs              1.3%   software development patterns practices principles concepts skills
User Interface Design         1.3%   design user interface coding responsive analysis complex create
Agile Development Model       1.3%   development agile test continuous integration driven practices scrum
Cloud-based Development       1.2%   cloud developer java python service relevant practical desirable
Software Development Cycle    0.9%   development software lifecycle method enterprise process cycle

Total                         100%
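
Topic tables of this kind come from fitting an LDA model and reading off the most frequent words of each topic. The authors used MALLET for this; the following is not MALLET, but a minimal collapsed Gibbs sampler for LDA in pure Python, given only as a sketch of the technique. The toy corpus, the hyperparameters `alpha` and `beta`, the iteration count, and all function names are assumptions. In particular, computing a topic's rate as its share of all token assignments is one plausible reading of the Rate column, not a definition taken from the paper.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents.

    Returns (vocabulary, topic_word_counts, doc_topic_counts).
    """
    rng = random.Random(seed)
    vocabulary = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocabulary)}
    vocab_size = len(vocabulary)

    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # n(k, w)
    doc_topic = [[0] * num_topics for _ in docs]                # n(d, k)
    topic_total = [0] * num_topics                              # n(k)
    assignments = []

    # Assign a random initial topic to every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            topic_word[k][word_id[w]] += 1
            doc_topic[d][k] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                topic_word[k][v] -= 1
                doc_topic[d][k] -= 1
                topic_total[k] -= 1
                # p(z = t | rest) is proportional to
                # (n(d,t) + alpha) * (n(t,w) + beta) / (n(t) + V * beta)
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                topic_word[k][v] += 1
                doc_topic[d][k] += 1
                topic_total[k] += 1

    return vocabulary, topic_word, doc_topic


def top_words(vocabulary, topic_word, n=8):
    """Most frequent words per topic, in descending order of count."""
    result = []
    for counts in topic_word:
        ranked = sorted(range(len(vocabulary)), key=lambda v: (-counts[v], vocabulary[v]))
        result.append([vocabulary[v] for v in ranked[:n] if counts[v] > 0])
    return result


def topic_rates(topic_word):
    """Share of all token assignments that fall into each topic, in percent."""
    totals = [sum(counts) for counts in topic_word]
    grand_total = sum(totals)
    return [100.0 * t / grand_total for t in totals]


# Hypothetical, already preprocessed postings.
docs = [
    ["javascript", "html", "css", "jquery", "frontend"],
    ["html", "css", "javascript", "angularjs"],
    ["database", "sql", "mysql", "oracle"],
    ["sql", "database", "nosql", "mysql"],
]
vocabulary, topic_word, doc_topic = lda_gibbs(docs, num_topics=2)
for words, rate in zip(top_words(vocabulary, topic_word), topic_rates(topic_word)):
    print(f"{rate:5.1f}%  {' '.join(words)}")
```

The topic names in Table 4 were assigned manually by the authors, so a sketch like this only produces the keyword lists and rates; labeling each topic remains a human step.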
skills, tools, platforms, competencies and technologies that indicate priorities in this ever-growing software industry.

4. Discussion

Our analysis revealed the vocational qualifications, combinations of programming languages, educational requirements, and trending topics demanded in the dynamic SE field. The findings of this analysis were presented in detail in the previous section. In this section, the findings are discussed in light of related studies.

The first finding was that a significant evolution was observed in the professional roles of software engineers. Compared with the study conducted by Litecky et al. [5], these roles have become increasingly diverse over time. In particular, the expanding coverage of mobile and web applications has increased the diversity of the professional roles. Many of the roles, such as Frontend Developer, Full Stack Engineer, and Cloud Systems Engineer, have emerged recently. A notable outcome is that many programming languages are directly demanded as roles, for example, Python developer, Java developer, JavaScript developer, Ruby on Rails developer, and C++ developer. Another notable outcome is that the developer title is often preferred over the programmer title that was widely used previously, as indicated in the study conducted by Chen et al. [4]. Developer, engineer, and administrator titles are used in defining the professional roles of software engineers.

The second finding was that the in-demand skills for software engineers are highly diverse, as stated in other studies [5, 13]. As a result of the analysis, 476 different skills were extracted. The range of the skills illustrates the boundaries of the software industry. This means that software engineers should have various combinations of skills in today’s progressive software market. Due to the developments in the software industry, the in-demand skills, especially programming languages and their specific combinations, are ever-changing over time and are an integral part of the SE field. Despite the increasing diversity in programming languages, the dominance of Java in the last decade is remarkable. Java is followed by JavaScript and Python, respectively. The widespread use of JavaScript libraries and Html5 increases the diversity of web developer skills. The importance of multi-platform applications on emerging platforms such as web, mobile, social, and cloud is increasing day by day. In this regard, multi-platform applications will provide new career opportunities for software engineers in the future [13].

The third finding demonstrated the educational requirements for the professional roles of software engineers. The educational requirements varied according to the roles, as outlined in the results section. In total, the requirements were stated in 45% of the software jobs. This rate indicated that there is a considerable gap (55%) between the software industry and academia, as discussed previously in several studies [2, 6–9, 25]. Academic institutions have a crucial responsibility to close this gap. In light of the findings of similar studies, SE curricula can be modernized by taking the emerging needs into account. In this regard, these and similar implications can provide valuable insights for training software engineers according to up-to-date industry needs [2, 7, 9, 25, 26].

The fourth finding revealed that the discovered topics reflect the main themes and trends in the SE field. These topics supported many conceptual and interpretable inferences. One of these was that the SE field is undergoing an important evolution from declining technologies to emerging ones. The topics outlined a wide range of skills, areas of expertise, and working environments in this field. In addition to technical skills, many non-technical skills, such as interpersonal, personal, and organizational skills, were observed in the topics. The wide coverage of the topics was also emphasized in a similar study based on the analysis of big data jobs using the Latent Semantic Analysis method [27]. In previous studies that rely on the analysis of job postings, keyword indexing approaches were frequently used as the content analysis method [7]. Therefore, supplementary studies based on probabilistic topic modeling are needed to assess the effectiveness of our methodology.

5. Conclusions

In this study, our main objective was to analyze the SE industry needs and trends, and to reveal the implications for education in this dynamic field. To this end, we conducted an empirical analysis to provide valuable insights and contributions to SE education. The methodology of this study was based on semantic topic analysis of SE job postings using the LDA model, a probabilistic topic modeling approach that is used to discover latent semantic patterns, called topics, in order to identify emerging needs and trends in the dynamic software industry. In this context, the findings of this study were: (1) As leading actors, software engineers have a wide spectrum of professional roles and responsibilities in the software industry. (2) Today’s software development environments require the effective usage of specific combinations of programming languages. (3) In terms of educational
requirements, software engineers are expected to hold at least a bachelor’s degree in approximately half of all software jobs, a finding that underlines the notable gap between the software industry and academia. (4) The topics discovered via LDA revealed the required qualifications for software engineers as well as the emerging needs and trends in the dynamic SE field.

At the individual level, our findings can help software engineers evaluate and update their own skills, instructors train qualified software engineers, and students plan their future careers in this field. At the institutional level, the findings may guide software companies in selecting qualified software engineers and academic institutions in meeting the need for a well-trained workforce for the software industry. Considering the findings of this study, an innovative academic curriculum for SE education can be designed consistent with industry needs and trends. Furthermore, the research methodology can be used for semantic content analysis on different datasets such as forums, online communities, blogs, and social networks. In summary, the findings of this study may provide valuable contributions to the SE field.

As in all research, this study was constrained by several limitations. The findings were based on snapshots of SE trends covering the six-month period from January 2016 to June 2016. The analysis was limited to English job postings. The methodology could not be applied to multiple languages due to the nature of the analysis performed in the study; this study can serve as a basis for future multilingual studies. Finally, our study was based on empirical analysis using a combination of many parameters. For this reason, selection of the optimal parameters is a major task that is still an open problem for LDA. In particular, determining the optimal number of topics required several rounds of experimentation. Thus, further confirmatory research is needed to validate and refine our results. Our methodology can be improved with new supporting approaches in order to perform context-sensitive semantic analysis more efficiently. In future work, we plan to extend our methodology using different topic modeling techniques and to apply it to learning analytics data to perform a more precise evaluation of SE education.

References

1. K. A. Cary, The software enterprise: Practicing best practices in software engineering education, International Journal of Engineering Education, 24(4), 2008, p. 705.
2. A. M. Moreno, M. I. Sanchez-Segura, F. Medina-Dominguez and L. Carvajal, Balancing software engineering education and industrial needs, Journal of Systems and Software, 85(7), 2012, pp. 1607–1620.
3. A. Mishra and D. Mishra, Industry Oriented Advanced Software Engineering Education Curriculum, Croatian Journal of Education, 14(3), 2012, pp. 595–624.
4. Y. Chen, R. Dios, A. Mili, L. Wu and K. Wang, An empirical study of programming language trends, Software, IEEE, 22(3), 2005, pp. 72–79.
5. C. Litecky, A. Aken, A. Ahmad and H. J. Nelson, Mining for computing jobs, Software, IEEE, 27(1), 2010, pp. 78–85.
6. B. Prabhakar, C. R. Litecky and K. Arnett, IT skills in a tough job market, Communications of the ACM, 48(10), 2005, pp. 91–94.
7. D. Smith and A. Ali, Analyzing Computer Programming Job Trend Using Web Data Mining, Issues in Informing Science and Information Technology, 11, 2014, pp. 203–214.
8. Y. Kim, J. Hsu and M. Stern, An update on the IS/IT skills gap, Journal of Information Systems Education, 17(4), 2006, p. 395.
9. I. Aldmour, A New Computer Engineering Curriculum Based on Technology Expansion to Address the Needs of Developing Communities, International Journal of Engineering Education, 30(6), 2014, pp. 1590–1601.
10. S. Chandrasekaran, A. Stojcevski, G. Littlefair and M. Joordens, Project-oriented design-based learning: aligning students’ views with industry needs, International Journal of Engineering Education, 29(5), 2013, pp. 1109–1118.
11. H. Artail, A methodology for combining development and research in teaching undergraduate software engineering, International Journal of Engineering Education, 24(3), 2008, pp. 567–580.
12. T. Abraham, C. Beath, C. Bullen, K. Gallagher, T. Goles, K. Kaiser and J. Simon, IT workforce trends: Implications for IS programs, Communications of the Association for Information Systems, 17(1), 2006, p. 50.
13. A. Barua, S. W. Thomas and A. E. Hassan, What are developers talking about? An analysis of topics and trends in Stack Overflow, Empirical Software Engineering, 19(3), 2014, pp. 619–654.
14. P. A. Laplante, What every engineer should know about software engineering, CRC Press, 2007.
15. D. M. Blei, A. Y. Ng and M. I. Jordan, Latent Dirichlet allocation, The Journal of Machine Learning Research, 3, 2003, pp. 993–1022.
16. D. M. Blei, Probabilistic topic models, Communications of the ACM, 55(4), 2012, pp. 77–84.
17. Stack Overflow Careers, http://careers.stackoverflow.com, Accessed 28 July 2016.
18. Stack Overflow, http://stackoverflow.com, Accessed 21 July 2016.
19. M. Kantardzic, Data mining: concepts, models, methods, and algorithms, John Wiley & Sons, 2011.
20. M. A. Pathak, Beginning Data Science with R, Springer, 2014.
21. T. L. Griffiths and M. Steyvers, Finding scientific topics, Proceedings of the National Academy of Sciences, 101(suppl. 1), 2004, pp. 5228–5235.
22. A. McCallum, MALLET: a machine learning for language toolkit, 2002, http://mallet.cs.umass.edu, Accessed 29 June 2016.
23. S. Geman and D. Geman, Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images, Pattern Analysis and Machine Intelligence, IEEE Transactions on, 6, 1984, pp. 721–741.
24. H. M. Wallach, Topic modeling: beyond bag-of-words, In Proceedings of the 23rd International Conference on Machine Learning, ACM, 2006, pp. 977–984.
25. G. V. B. Subrahmanyam, A dynamic framework for software engineering education curriculum to reduce the gap between the software organizations and software educational institutions, In 2009 22nd Conference on Software Engineering Education and Training, IEEE, 2009, pp. 248–254.
26. N. E. Gibbs and R. E. Fairley (Eds), Software Engineering Education: The Educational Needs of the Software Community, Springer Science & Business Media, 2012.
27. S. Debortoli, O. Müller and J. vom Brocke, Comparing Business Intelligence and Big Data Skills, Business & Information Systems Engineering, 6(5), 2014, pp. 289–300.

Fatih Gurcan is an instructor in the Department of Informatics and a PhD student in the Department of Computer Engineering at Karadeniz Technical University, Trabzon, Turkey. He received his BS degree in the Department of Statistics and Computer Sciences and his MS degree in the Department of Computer Engineering from Karadeniz Technical University, Trabzon, Turkey. His research interests include trend analysis, sentiment analysis, statistical topic modeling, engineering education, big data analytics, and text mining.

Cemal Kose is a Professor and Head of the Computer Engineering Department at Karadeniz Technical University, Trabzon, Turkey. He received his BS and MS degrees in the Department of Electrical and Electronic Engineering from Karadeniz Technical University, Trabzon, Turkey. He received his PhD degree in the Department of Computer Science from the University of Bristol, Bristol, UK. His research interests include text mining, information retrieval, topic analysis, image processing, pattern recognition, parallel computers, and computer graphics.
