Data analysis for couples

library(DMwR2)
library(ggplot2)
library(tidyr)
library(dplyr)
library(visdat)
library(plotly)
library(DescTools)
library(RColorBrewer)
library(ggcorrplot)

setwd('/Users/mathisbouvet/Documents/Stage/Données Analytiques/Données Analytique')
APC=read.csv('Couple2.csv',sep=";",header=TRUE)
APC$Fertilité.Homme=as.factor(APC$Fertilité.Homme)
APC$Fertilité.Femme=as.factor(APC$Fertilité.Femme)
APC$Fertilité.couple=as.factor(APC$Fertilité.couple)

Part I: Processing the raw data

'data.frame':   146 obs. of  17 variables:
 $ Primary.ID      : chr  "001PT" "002AM" "004VL" "005FJ" ...
 $ Score.femme     : num  0.339 0.475 0.182 0.436 0.469 ...
 $ Score.homme     : num  0.319 0.315 0.435 0.351 0.432 ...
 $ Score.couple    : num  0.28 0.338 0.283 0.288 0.458 ...
 $ Fertilité.Homme : Factor w/ 2 levels "Non","Oui": 1 1 1 1 1 1 1 1 1 1 ...
 $ Fertilité.Femme : Factor w/ 2 levels "Non","Oui": 1 1 1 1 1 1 1 1 1 1 ...
 $ Fertilité.couple: Factor w/ 2 levels "Non","Oui": 1 1 1 1 1 1 1 1 1 1 ...
 $ HTempsAssis     : int  360 300 180 720 480 420 720 240 600 600 ...
 $ HMarcheMET      : num  NA NA 0 0 165 165 660 0 330 990 ...
 $ HModereMET      : int  0 0 9600 0 6000 4800 900 6000 1200 1200 ...
 $ HIntensiteMET   : int  0 800 0 3600 4800 0 3680 0 1200 0 ...
 $ HTotalMET       : num  0 800 9600 3600 10965 ...
 $ FTempsAssis     : int  300 240 180 300 840 240 780 120 600 240 ...
 $ FMarcheMET      : num  742 0 495 0 165 ...
 $ FModereMET      : int  1000 4800 8400 1000 600 1200 600 12000 2400 3600 ...
 $ FIntensiteMET   : int  0 0 0 0 0 2400 3600 0 0 0 ...
 $ FTotalMET       : num  1742 4800 8895 1000 765 ...
d_numeric <- APC[, sapply(APC, is.numeric)]

1. Handling missing values

nrow(d_numeric[!complete.cases(d_numeric),])
[1] 39

Imputing missing data with the K nearest neighbors

Given the small sample size (146 observations), a K that is too large can pull in neighbors that are too distant and dilute the accuracy of the prediction, so K is set to 7. Each missing value is replaced by the distance-weighted average of its neighbors' values.

d1 <- knnImputation(d_numeric, k = 7, scale = TRUE, meth = "weighAvg")
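To make the weighted-average idea concrete, here is a minimal base R sketch on hypothetical toy data. It is not the DMwR2 implementation: the helper `impute_one`, the toy values, and the weight `exp(-distance)` are assumptions chosen for illustration, and the variables are not standardized here, whereas `scale = TRUE` in the call above standardizes them first.

```r
# Toy data (hypothetical values): row 1 is missing its "y" value
toy <- data.frame(
  x = c(1.0, 1.1, 0.9, 5.0, 5.2),
  y = c(NA, 10, 12, 50, 52)
)

# Impute one missing cell from the k nearest complete rows,
# weighting each neighbor by exp(-distance) (illustrative choice)
impute_one <- function(df, row, col, k) {
  others   <- setdiff(names(df), col)
  complete <- df[complete.cases(df), , drop = FALSE]
  # Euclidean distance on the columns that are observed for `row`
  diffs <- sweep(as.matrix(complete[, others, drop = FALSE]), 2,
                 unlist(df[row, others]))
  d       <- sqrt(rowSums(diffs^2))
  nearest <- order(d)[seq_len(k)]
  w       <- exp(-d[nearest])
  sum(w * complete[nearest, col]) / sum(w)
}

imputed <- impute_one(toy, row = 1, col = "y", k = 2)
```

With `k = 2` only the two rows closest in `x` contribute, so the imputed value falls between their `y` values; raising `k` to 4 would also bring in the distant rows, which is the dilution effect described above.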

2. Handling outliers and extreme values

An outlier, or extreme value, is an observation that lies far from the others measured on the same phenomenon. Because the data are self-reported, that alone is sufficient grounds for imputing these values.

df_long <- pivot_longer(d1, cols = everything(), names_to = "Variable", values_to = "Valeurs")
ggplot(df_long, aes(x = Variable, y = Valeurs)) +
  geom_boxplot(fill = "lightblue", color = "darkblue") +
  labs(title = "Boxplots for each variable", x = "Variables", y = "Values") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Extreme values are visible in all of the physical-activity variables as well as in the walking data.

Imputing outliers by winsorization

To avoid deleting observations, winsorization is used to pull extreme values back toward the bulk of the data. With its default settings, Winsorize from DescTools caps each variable at its 5th and 95th percentiles, which is not exactly the same as the whisker limits of the boxplots.

# Winsorize each variable with the default limits of DescTools::Winsorize
vars_to_winsorize <- c("FIntensiteMET", "FMarcheMET", "FModereMET", "FTempsAssis", "FTotalMET",
                       "HIntensiteMET", "HMarcheMET", "HModereMET", "HTotalMET")
d1[vars_to_winsorize] <- lapply(d1[vars_to_winsorize], Winsorize)
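As an illustration of what winsorization does to a single variable, the following base R sketch caps a vector at two quantiles. The helper name `winsorize_sketch` and the toy vector are hypothetical, and the 5%/95% limits are an assumption mirroring the usual defaults; it is not the DescTools code.

```r
# Cap values below the lower quantile and above the upper quantile
winsorize_sketch <- function(x, probs = c(0.05, 0.95)) {
  limits <- quantile(x, probs = probs, na.rm = TRUE, names = FALSE)
  pmin(pmax(x, limits[1]), limits[2])
}

x <- c(1:19, 1000)        # one obviously extreme value
w <- winsorize_sketch(x)

range(x)                  # original range, dominated by the extreme value
range(w)                  # range after capping
```

The number of observations is unchanged; only the tails are pulled in, which is why this approach is preferred here over deleting rows.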

3. Distribution of the data

Null hypothesis H0: the physical-activity variables and the scores follow a normal distribution.

See the results
                   Variable      P_value
Score.femme     Score.femme 3.095685e-01
Score.homme     Score.homme 8.471204e-01
Score.couple   Score.couple 9.872415e-02
HTempsAssis     HTempsAssis 1.094362e-02
HMarcheMET       HMarcheMET 8.357670e-11
HModereMET       HModereMET 4.780494e-13
HIntensiteMET HIntensiteMET 1.195418e-12
HTotalMET         HTotalMET 8.092908e-07
FTempsAssis     FTempsAssis 2.106831e-04
FMarcheMET       FMarcheMET 1.033959e-13
FModereMET       FModereMET 3.287123e-13
FIntensiteMET FIntensiteMET 8.886192e-18
FTotalMET         FTotalMET 5.183212e-09
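The output above does not show which test produced these p-values. As a sketch, assuming a Shapiro-Wilk test (`shapiro.test` from the stats package), a table of this shape could be built as follows; the toy data frame and its column names are hypothetical.

```r
set.seed(1)
toy <- data.frame(
  normal_like = rnorm(100),   # drawn from a normal distribution
  skewed      = rexp(100)     # strongly right-skewed
)

# One p-value per numeric column; sapply keeps the column names as row names
normality <- data.frame(
  Variable = names(toy),
  P_value  = sapply(toy, function(col) shapiro.test(col)$p.value)
)
normality
```

A p-value below the chosen threshold (0.05 in this analysis) leads to rejecting H0 for that variable.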

At the 5% level, normality is not rejected for the three fertility scores, but it is rejected for every physical-activity variable (all p < 0.05).

Part II: Correlation between the scores

Warning in cor.test.default(x, y, method = "spearman"): Cannot compute exact
p-value with ties
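This warning appears because tied values prevent `cor.test` from computing an exact Spearman p-value; it then falls back to an approximation. The code that builds `cor_matrix` and `p_matrix` is not shown in this report, so the following is only a sketch of one way to obtain such matrices on hypothetical toy data, passing `exact = FALSE` to request the approximation explicitly and avoid the warning.

```r
set.seed(42)
# Rounded values create ties, as in the questionnaire data
toy <- data.frame(
  a = round(rnorm(50), 1),
  b = round(rnorm(50), 1)
)
toy$c <- toy$a + round(rnorm(50, sd = 0.2), 1)

vars <- names(toy)
n    <- length(vars)
cor_matrix <- matrix(1, n, n, dimnames = list(vars, vars))
p_matrix   <- matrix(0, n, n, dimnames = list(vars, vars))

# Fill both matrices pair by pair (upper and lower triangle together)
for (i in seq_len(n - 1)) {
  for (j in (i + 1):n) {
    test <- cor.test(toy[[i]], toy[[j]], method = "spearman", exact = FALSE)
    cor_matrix[i, j] <- cor_matrix[j, i] <- unname(test$estimate)
    p_matrix[i, j]   <- p_matrix[j, i]   <- test$p.value
  }
}
```

Spearman's rank correlation is a natural choice here because most of the activity variables are not normally distributed.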

colors <- brewer.pal(n = 3, name = "Accent")  # "Accent" palette from RColorBrewer; ggcorrplot uses three colors (low, mid, high)
ggcorrplot(cor_matrix, 
           method = "circle", 
           type = "lower", 
           lab = TRUE, 
           lab_size = 3, 
           show.legend = TRUE,
           title = "Correlation matrix with significance",  
           ggtheme = theme_minimal(), 
           p.mat = p_matrix, 
           sig.level = 0.05, 
           insig = "blank", 
           colors = colors)

3D visualization

# Linear model of the couple score on the two individual scores
model <- lm(Score.couple ~ Score.homme + Score.femme, data = d1)
plot_ly(data = d1, 
        x = ~Score.homme, 
        y = ~Score.femme, 
        z = ~Score.couple, 
        type = "scatter3d", 
        mode = "markers", 
        marker = list(size = 5, color = ~Score.couple, colorscale = 'Viridis', showscale = TRUE)) %>%
  # inherit = FALSE keeps the marker settings above out of the line trace
  add_trace(x = ~Score.homme, 
            y = ~Score.femme, 
            z = fitted(model), 
            type = "scatter3d", 
            mode = "lines", 
            line = list(width = 4, color = 'red'),
            inherit = FALSE)
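The red trace above joins the fitted values in row order, so it does not draw the regression plane itself. As a sketch of one alternative, the fitted plane can be evaluated on a regular grid; the toy data, its coefficients, and the grid sizes below are hypothetical and only illustrate the mechanics.

```r
set.seed(7)
# Hypothetical scores standing in for d1
toy <- data.frame(Score.homme = runif(40, 0.2, 0.5),
                  Score.femme = runif(40, 0.2, 0.5))
toy$Score.couple <- 0.05 + 0.4 * toy$Score.homme + 0.4 * toy$Score.femme +
  rnorm(40, sd = 0.02)

model <- lm(Score.couple ~ Score.homme + Score.femme, data = toy)

# Regular grid covering the observed range of both predictors
x_seq <- seq(min(toy$Score.homme), max(toy$Score.homme), length.out = 20)
y_seq <- seq(min(toy$Score.femme), max(toy$Score.femme), length.out = 15)
grid  <- expand.grid(Score.homme = x_seq, Score.femme = y_seq)

# Rows index y_seq and columns index x_seq
z_plane <- matrix(predict(model, newdata = grid),
                  nrow = length(y_seq), ncol = length(x_seq), byrow = TRUE)
```

With plotly, a matrix laid out this way could then be drawn as a surface, for example with `add_surface(x = x_seq, y = y_seq, z = z_plane)`, on top of the 3D scatter of the observed scores.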