`KLR_predict()` predicts the probability of site presence for a new list of data, based on the fitted alpha parameters returned from the `KLR()` function.
KLR_predict(test_data, train_data, alphas_pred, sigma, progress = TRUE, dist_metric = "euclidean")
Argument | Description |
---|---|
test_data | [list] Testing data to predict class |
train_data | [list] Training data used to create similarity kernel matrix |
alphas_pred | [vector] Numeric vector of alpha parameters from the `KLR()` function |
sigma | [scalar] Smoothing parameter for RBF kernel |
progress | [logical] FALSE = no progress bar; TRUE = show progress bar |
dist_metric | [character] One of the distance methods from rdist::cdist. Default = "euclidean". See ?rdist::cdist |
- [vector] Predicted probability of the positive class
This function takes a list of the `test_data`, a list of the `train_data`, a vector of the approximated alpha parameters as `alphas_pred`, a scalar value for the `sigma` kernel hyperparameter, and a distance method (default = "euclidean"). It predicts the probability of site presence for new observations based on the training data and the `alphas` parameters. This is accomplished by building the `k*k` kernel matrix as the similarity between the training and test data, then computing the inverse logit of `k*k
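
The prediction step described above can be illustrated with a minimal base-R sketch. It is not the package's implementation: the helper names `mean_rbf_similarity()` and `predict_sketch()` are hypothetical, the kernel form `exp(-sigma * d^2)` with Euclidean distance is an assumption that may differ from the package's parameterization, and each bag is assumed to be a purely numeric matrix or data frame with one observation per row.

```r
# Hypothetical helper: mean RBF similarity between two bags of observations.
# The kernel form exp(-sigma * d^2) is an assumption for illustration only.
mean_rbf_similarity <- function(bag_a, bag_b, sigma) {
  a <- as.matrix(bag_a)
  b <- as.matrix(bag_b)
  # Squared Euclidean distance between every row of a and every row of b
  sq_dist <- outer(rowSums(a^2), rowSums(b^2), "+") - 2 * a %*% t(b)
  sq_dist[sq_dist < 0] <- 0  # guard against tiny negative rounding errors
  mean(exp(-sigma * sq_dist))
}

# Hypothetical stand-in for the prediction step
predict_sketch <- function(test_data, train_data, alphas_pred, sigma) {
  # Rows index test bags, columns index training bags
  k_star <- matrix(NA_real_,
                   nrow = length(test_data),
                   ncol = length(train_data))
  for (j in seq_along(test_data)) {
    for (i in seq_along(train_data)) {
      k_star[j, i] <- mean_rbf_similarity(test_data[[j]], train_data[[i]], sigma)
    }
  }
  # Inverse logit of the kernel matrix multiplied by the alpha vector
  as.vector(1 / (1 + exp(-(k_star %*% alphas_pred))))
}

# Toy usage with made-up bags and alpha values
set.seed(1)
train_bags <- list(matrix(rnorm(10), ncol = 2),
                   matrix(rnorm(10, mean = 3), ncol = 2))
test_bags <- list(matrix(rnorm(6), ncol = 2))
probs <- predict_sketch(test_bags, train_bags,
                        alphas_pred = c(1.5, -1.5), sigma = 0.5)
```

The result has one probability per test bag. With all alphas set to zero, every prediction is exactly 0.5, which is a quick sanity check on the inverse-logit step.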
# NOT RUN {
# Illustrative hyperparameter values; the original example leaves these undefined
sigma <- 0.5
lambda <- 0.1
dist_metric <- "euclidean"

sim_data <- get_sim_data(site_samples = 800, N_site_bags = 75,
                         sites_var1_mean = 80, sites_var1_sd = 10,
                         sites_var2_mean = 5, sites_var2_sd = 2,
                         backg_var1_mean = 100, backg_var1_sd = 20,
                         backg_var2_mean = 6, backg_var2_sd = 3)
formatted_data <- format_site_data(sim_data, N_sites = 10,
                                   train_test_split = 0.8,
                                   sample_fraction = 0.9,
                                   background_site_balance = 1)
train_data <- formatted_data[["train_data"]]
train_presence <- formatted_data[["train_presence"]]
# Assumes format_site_data() also returns a "test_data" element
test_data <- formatted_data[["test_data"]]
test_presence <- formatted_data[["test_presence"]]

##### Logistic Mean Embedding KLR Model
#### Build Kernel Matrix
K <- build_K(train_data, sigma = sigma, dist_metric = dist_metric)
#### Train
train_log_pred <- KLR(K, train_presence, lambda, 100, 0.001, verbose = 2)
#### Predict
test_log_pred <- KLR_predict(test_data, train_data,
                             train_log_pred[["alphas"]], sigma,
                             dist_metric = dist_metric)
# }