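Statistical Data Analysis Competition
* The source data has roughly 100 columns.
* Only the required items are extracted (prefecture number, total population, habitable area, total number of establishments).
* Population per area, establishments per area, and the logarithms of both are obtained by calculation.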
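The extraction and calculation steps listed above can be sketched in base R. This is only an illustration under stated assumptions: the `raw` data frame and its values are made up, `other_col` stands in for the unused source columns, and the column names follow the ones used in the code further down. The original does not show whether the per-area columns were computed in R or prepared beforehand.

```r
# Hypothetical stand-in for the ~100-column source table (all values are made up)
raw <- data.frame(
  prefnum    = c(1, 2, 3),
  population = c(5000000, 1200000, 1300000),
  area       = c(22000, 3200, 3700),
  office     = c(230000, 60000, 60000),
  other_col  = c("a", "b", "c")   # represents the columns that are not needed
)

# Keep only the four required columns
data_demo <- raw[, c("prefnum", "population", "area", "office")]

# Derive the per-area measures and their natural logs
data_demo$pop_per_area        <- data_demo$population / data_demo$area
data_demo$office_per_area     <- data_demo$office / data_demo$area
data_demo$log_pop_per_area    <- log(data_demo$pop_per_area)
data_demo$log_office_per_area <- log(data_demo$office_per_area)

print(data_demo)
```

Computing the ratios before taking logs keeps both the raw and the log versions available, which is what the two scatter plots and the two correlation tests rely on.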
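library(dplyr)    # provides %>% and rename()
library(ggplot2)  # provides ggplot() for the scatter plots
data1 <- read.csv("SSDSE_practice.csv", header = TRUE, sep = "\t")  # tab-separated input
# Natural logs of the two per-area measures
data1$log_pop_per_area <- with(data1, log(pop_per_area))
data1$log_office_per_area <- with(data1, log(office_per_area))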
# Relabel the columns with Japanese display names (new name = old name)
data2 <- data1 %>%
  dplyr::rename('都道府県番号' = prefnum, '総人口' = population, '居住可能面積' = area,
                '総事業所数' = office, '面積当たり人口' = pop_per_area, '面積当たり事業所数' = office_per_area,
                '対数面積当たり人口' = log_pop_per_area, '対数面積当たり事業所数' = log_office_per_area)
data2
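# Set up the axes and plotting area; the title and axis labels belong in labs()
ggplot(data1, aes(x = office_per_area, y = pop_per_area)) +
  labs(title = "Scatter plot (raw values)",
       x = "Establishments per area", y = "Population per area") +
  geom_point() +                          # draw the scatter plot
  geom_smooth(method = "lm", se = FALSE)  # draw the regression line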
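# Set up the axes and plotting area; the title and axis labels belong in labs()
ggplot(data1, aes(x = log_office_per_area, y = log_pop_per_area)) +
  labs(title = "Scatter plot (log scale)",
       x = "Log of establishments per area", y = "Log of population per area") +
  geom_point() +                          # draw the scatter plot
  geom_smooth(method = "lm", se = FALSE)  # draw the regression line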
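The line drawn by `geom_smooth(method = "lm")` is, by default, an ordinary least-squares fit of y on x, so its intercept and slope can be inspected with `lm()`. The sketch below uses a small made-up data frame (`demo`) rather than the real data, so the numbers it prints say nothing about the prefectures.

```r
# Hypothetical data: y is roughly linear in x (all values are made up)
demo <- data.frame(
  log_office_per_area = c(2.0, 2.5, 3.0, 3.5, 4.0),
  log_pop_per_area    = c(5.1, 5.6, 6.0, 6.6, 7.0)
)

# Ordinary least-squares fit of y on x, the same model form as the plotted line
fit <- lm(log_pop_per_area ~ log_office_per_area, data = demo)

coef(fit)               # intercept and slope of the fitted line
summary(fit)$r.squared  # share of variance explained by the line

# In a simple regression, R-squared equals the squared Pearson correlation
cor(demo$log_office_per_area, demo$log_pop_per_area)^2
```

Replacing `demo` with `data1` would give the coefficients of the line shown in the log-scale plot.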
cor.test(data1$office_per_area, data1$pop_per_area, alternative="two.sided",
method="pearson")
##
## Pearson's product-moment correlation
##
## data: data1$office_per_area and data1$pop_per_area
## t = 42.32, df = 45, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.9778438 0.9931522
## sample estimates:
## cor
## 0.9876689
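As a cross-check on the output above, the t statistic and the 95 percent confidence interval can be recomputed from the reported correlation and degrees of freedom alone. This is a sketch using the standard formulas for Pearson's test (the t transformation of r, and a Fisher z-transformation for the interval); the variable names are illustrative, and df = 45 implies n = 47 observations.

```r
r  <- 0.9876689   # sample correlation reported above
df <- 45          # degrees of freedom reported above (n - 2)
n  <- df + 2      # number of observations

# t statistic for the null hypothesis that the true correlation is 0
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# 95% confidence interval via the Fisher z-transformation
z      <- atanh(r)
margin <- qnorm(0.975) / sqrt(n - 3)
ci     <- tanh(c(z - margin, z + margin))

t_stat  # compare with the reported t
ci      # compare with the reported interval
```

Because r is so close to 1, the denominator `sqrt(1 - r^2)` is small, which is why the t statistic is large and the p-value falls below the smallest value R prints.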
cor.test(data1$log_office_per_area, data1$log_pop_per_area, alternative="two.sided", method="pearson")
##
## Pearson's product-moment correlation
##
## data: data1$log_office_per_area and data1$log_pop_per_area
## t = 61.199, df = 45, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.9892747 0.9966983
## sample estimates:
## cor
## 0.9940461
That's all for today...