Usage
import copepodTCR as cpp
The Quickstart section is enough for basic tasks. For details on the individual functions, see the other sections.
Quickstart
import copepodTCR as cpp
import pandas as pd  # used below to read and build tables
# number of pools
n_pools = 12
# peptide occurrence
iters = 4
# number of peptides
len_lst = 253
# address arrangement
b, lines = cpp.address_rearrangement_AU(n_pools=n_pools, iters=iters, len_lst=len_lst)
# add your peptides to lst
lst = list(pd.read_csv('peptides.csv', sep = "\t"))
# pooling scheme generation
pools, peptide_address = cpp.pooling(lst=lst, addresses=lines, n_pools=n_pools)
# simulation
check_results = cpp.run_experiment(lst=lst, peptide_address=peptide_address, ep_length=8, pools=pools, iters=iters, n_pools=n_pools, regime='without dropouts')
# STL files generation
# add peptide scheme to peptides_table_stl, with header and index as column and row numbers
peptides_table_stl = pd.read_csv('peptides_scheme.tsv', sep = "\t", index_col = 0)
pools_df = pd.DataFrame({'Peptides': [';'.join(val) for val in pools.values()]}, index=pools.keys())
meshes_list = cpp.pools_stl(peptides_table = peptides_table_stl, pools = pools_df, rows = 16, cols = 24, length = 122.10, width = 79.97,
thickness = 1.5, hole_radius = 2, x_offset = 9.05, y_offset = 6.20, well_spacing = 4.5)
cpp.zip_meshes_export(meshes_list)
# Results of the experiment as a table with two columns, Pool and Percentage. Activation signal is expressed as the percentage of activated T cells.
exp_results = pd.read_csv('path/to/your/file')
cells = list(exp_results['Percentage'])
inds = list(exp_results['Pool'])
# Model
fig, probs = cpp.activation_model(cells, n_pools, inds)
peptide_probs = cpp.peptide_probabilities(check_results, probs)
message, most, possible = cpp.results_analysis(peptide_probs, probs, check_results)
print(message)
print(most)
print(possible)
More detailed quickstart
(Optional) Check your peptide list for overlap consistency.
Note
Inconsistent overlap length can hinder interpretation of the results.
You can check all peptides for their overlap length with the next peptide (list of peptides should be ordered):
- cpp.all_overlaps(lst) -> Counter object
- Parameters:
lst (list) – ordered list of peptides
- Returns:
Counter object with the dictionary, where the key is the overlap length and the value is the number of pairs with such overlap.
- Return type:
Counter object
>>> cpp.all_overlaps(lst)
Counter({12: 251, 16: 1})
=> 251 pairs of peptides overlap by 12 amino acids, and 1 pair overlaps by 16 amino acids.
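To make the notion of overlap concrete, here is a minimal pure-Python sketch. It is not the package's implementation: the helper names overlap_length and count_overlaps are hypothetical, and it assumes that the overlap is the longest suffix of one peptide that is also a prefix of the next.

```python
from collections import Counter


def overlap_length(left, right):
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left[-k:] == right[:k]:
            return k
    return 0


def count_overlaps(peptides):
    """Tally overlap lengths between each peptide and the one that follows it."""
    return Counter(overlap_length(a, b) for a, b in zip(peptides, peptides[1:]))


# Three consecutive peptides taken from the examples in this document.
peptides = ['MFVFLVLLPLVSSQCVN', 'VLLPLVSSQCVNLTTRT', 'VSSQCVNLTTRTQLPPA']
print(count_overlaps(peptides))
```

The list must be ordered as the peptides overlap; shuffling it changes the result.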
Also, you can check which peptides have such an overlap with the next peptide:
- cpp.find_pair_with_overlap(lst, target_overlap) -> list
- Parameters:
lst (list) – ordered list of peptides
target_overlap (int) – overlap length
- Returns:
list of lists with peptides with specified overlap length.
- Return type:
list
>>> cpp.find_pair_with_overlap(lst, 16)
[['FDEDDSEPVLKGVKLHY', 'DEDDSEPVLKGVKLHYT']]
=> Overlap of length 16 amino acids is in peptides FDEDDSEPVLKGVKLHY and DEDDSEPVLKGVKLHYT.
Also, you can check what number of peptides share the same epitope. It might help to interpret the results later.
- cpp.how_many_peptides(lst, ep_length) -> Counter object, dictionary
- Parameters:
lst (list) – ordered list of peptides
ep_length (int) – expected epitope length
- Returns:
the Counter object with the number of epitopes shared across the number of peptides;
the dictionary with all possible epitopes of expected length as keys and the number of peptides where these epitopes are present as values.
- Return type:
Counter object, dictionary
>>> t, r = cpp.how_many_peptides(lst, 8)
>>> t
Counter({1: 6, 2: 1256, 3: 4})
>>> r
{'MFVFLVLL': 1, 'FVFLVLLP': 1, 'VFLVLLPL': 1, 'FLVLLPLV': 1, 'LVLLPLVS': 1, 'VLLPLVSS': 2, ...}
=> There are 6 epitopes present in a single peptide, 1256 epitopes shared by two peptides, and 4 epitopes shared by three peptides. For each epitope, the number of peptides sharing it is in the dictionary.
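The counting described above can be sketched in plain Python. This is an illustration under assumptions, not the package's code: epitope_sharing is a hypothetical helper, and it treats every substring of length ep_length as a candidate epitope.

```python
from collections import Counter


def epitope_sharing(peptides, ep_length):
    """Map each candidate epitope to the number of peptides that contain it."""
    shared = {}
    for peptide in peptides:
        # A set avoids counting an epitope twice if it repeats within one peptide.
        kmers = {peptide[i:i + ep_length]
                 for i in range(len(peptide) - ep_length + 1)}
        for kmer in kmers:
            shared[kmer] = shared.get(kmer, 0) + 1
    # Summary: how many epitopes are present in 1, 2, 3, ... peptides.
    return Counter(shared.values()), shared


peptides = ['MFVFLVLLPLVSSQCVN', 'VLLPLVSSQCVNLTTRT', 'VSSQCVNLTTRTQLPPA']
summary, per_epitope = epitope_sharing(peptides, 8)
print(per_epitope['MFVFLVLL'], per_epitope['VLLPLVSS'])
```

Epitopes near the start of the first peptide appear in one peptide only, while epitopes inside an overlap appear in two.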
(Optional) Then you need to determine peptide occurrence across pools, i.e. to how many pools one peptide would be added.
Note
Peptide occurrence affects number of peptides in one pool, and therefore too high peptide occurrence may lead to higher dilution of a single peptide.
- cpp.find_possible_k_values(n, l) -> list
- Parameters:
n (int) – number of pools
l (int) – number of peptides
- Returns:
list with possible peptide occurrences given number of pools and number of peptides.
- Return type:
list
>>> cpp.find_possible_k_values(12, 250)
[4, 5, 6, 7, 8]
=> Given 12 pools and 250 peptides, you can use peptide occurrence equal to 4, 5, 6, 7, 8.
Choose one occurrence value appropriate for your task and proceed.
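One way to reason about which occurrence values are feasible is sketched below. It rests on an assumption that is not stated in this document: an occurrence k is taken to be feasible when the number of distinct k-pool addresses, C(n, k), is at least the number of peptides. The helper name possible_occurrences is hypothetical, and the package may apply additional constraints.

```python
from math import comb


def possible_occurrences(n_pools, n_peptides):
    """Occurrences k for which there are at least n_peptides distinct addresses."""
    return [k for k in range(1, n_pools + 1)
            if comb(n_pools, k) >= n_peptides]


print(possible_occurrences(12, 250))
```

For 12 pools and 250 peptides this criterion yields the same values as the example above, since C(12, 3) = 220 is below 250 while C(12, 4) = 495 is not.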
Now, you need to find the address arrangement given your number of pools, number of peptides, and peptide occurrence.
We suggest you use the cpp.address_rearrangement_AU() function. In the section Address arrangement you can find other functions that can perform such a task (based on Gray codes and on a trivial Hamiltonian path search).
Note
With large parameters, the algorithm needs some time to finish the arrangement. If the arrangement fails, try with other parameters.
- cpp.address_rearrangement_AU(n_pools, iters, len_lst) -> list, list
- Parameters:
n_pools (int) – number of pools
iters (int) – peptide occurrence
len_lst (int) – number of peptides
- Returns:
list with number of peptides in each pool;
list with address arrangement
- Return type:
list, list
>>> b, lines = cpp.address_rearrangement_AU(n_pools=12, iters=4, len_lst=250)
>>> b
[81, 85, 85, 85, 81, 82, 87, 81, 85, 81, 84, 83]
>>> lines
[[0, 1, 2, 3], [0, 1, 3, 6], [0, 1, 6, 8], [1, 6, 8, 9], [6, 8, 9, 11], ... ]
=> You will get the expected number of peptides in each pool and address arrangement, which will be used in following steps.
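The example output suggests two properties of an arrangement: neighbouring addresses differ in exactly one pool, and the union of each neighbouring pair is unique. Both properties are inferred from the example and from the address/union terminology used later in this document, so treat the check below as a sketch; check_arrangement is a hypothetical helper and not part of the package.

```python
def check_arrangement(addresses):
    """True if neighbours differ in one pool and all addresses and unions are unique."""
    unions = []
    for prev, curr in zip(addresses, addresses[1:]):
        a, b = set(prev), set(curr)
        # Neighbouring addresses must share all pools but one.
        if len(a) != len(b) or len(a & b) != len(a) - 1:
            return False
        unions.append(frozenset(a | b))
    unique_addresses = len({frozenset(a) for a in addresses}) == len(addresses)
    unique_unions = len(set(unions)) == len(unions)
    return unique_addresses and unique_unions


# First five addresses from the example output above.
lines = [[0, 1, 2, 3], [0, 1, 3, 6], [0, 1, 6, 8], [1, 6, 8, 9], [6, 8, 9, 11]]
print(check_arrangement(lines))
```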
Now, you can distribute peptides across pools using the produced address arrangement. One peptide will be added to one produced address.
Note
Keep in mind that peptides should be ordered as they overlap.
- cpp.pooling(lst, addresses, n_pools) -> dictionary, dictionary
- Parameters:
lst (list) – ordered list with peptides
addresses (list) – produced address arrangement
n_pools (int) – number of pools
- Returns:
pools – dictionary with pool indices as keys and the peptides that should be added to each pool as values;
peptide address – dictionary with peptides as keys and corresponding addresses as values.
- Return type:
dictionary, dictionary
>>> pools, peptide_address = cpp.pooling(lst=lst, addresses=lines, n_pools=12)
>>> pools
{0: ['MFVFLVLLPLVSSQCVN', 'VLLPLVSSQCVNLTTRT', 'VSSQCVNLTTRTQLPPA', ...], 1: ['MFVFLVLLPLVSSQCVN', 'VLLPLVSSQCVNLTTRT', 'TQDLFLPFFSNVTWFHA', ...], ... }
>>> peptide_address
{'MFVFLVLLPLVSSQCVN': [0, 1, 2, 3], 'VLLPLVSSQCVNLTTRT': [0, 1, 2, 10], ... }
=> You will get the pooling scheme and peptide addresses.
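The idea of assigning one peptide to one address can be sketched as follows. This is a simplified illustration, assuming the i-th peptide simply receives the i-th address; pooling_sketch is a hypothetical name, and the addresses in the usage example are made up for the illustration.

```python
def pooling_sketch(peptides, addresses, n_pools):
    """Give the i-th peptide the i-th address and collect peptides per pool."""
    pools = {pool: [] for pool in range(n_pools)}
    peptide_address = {}
    for peptide, address in zip(peptides, addresses):
        peptide_address[peptide] = list(address)
        for pool in address:
            pools[pool].append(peptide)
    return pools, peptide_address


peptides = ['MFVFLVLLPLVSSQCVN', 'VLLPLVSSQCVNLTTRT', 'VSSQCVNLTTRTQLPPA']
addresses = [[0, 1, 2, 3], [0, 1, 2, 10], [0, 1, 9, 10]]  # illustrative only
pools, peptide_address = pooling_sketch(peptides, addresses, n_pools=12)
print(pools[0])
print(peptide_address['VLLPLVSSQCVNLTTRT'])
```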
Now, you can run the simulation using produced pools and peptide_address.
The simulation produces a DataFrame with every possible epitope of the provided length and all pools where this epitope is present. This table is needed to interpret the results.
The function has two regimes: with and without drop-outs. Without drop-outs, it returns a table as if there were no mistakes and all pools that should be activated were activated. With drop-outs, it returns a table with all possible mistakes (i.e. all possible non-activated pools). Generating this table takes time, usually several minutes, depending on the number of peptides and on the occurrence.
- cpp.run_experiment(lst, peptide_address, ep_length, pools, iters, n_pools, regime) -> pandas DataFrame
Note
Simulation may take several minutes, especially in the “with drop-outs” regime.
- Parameters:
lst (list) – ordered list with peptides
peptide_address (dictionary) – peptides addresses produced by pooling
ep_length (int) – expected epitope length
pools (dictionary) – pools produced by pooling
iters (int) – peptide occurrence
n_pools (int) – number of pools
regime (“with dropouts” or “without dropouts”) – regime of simulation, with or without drop-outs
- Returns:
pandas DataFrame with all possible epitopes of given length and the resulting activated pools
- Return type:
pandas DataFrame
>>> df = cpp.run_experiment(lst=lst, peptide_address=peptide_address, ep_length=8, pools=pools, iters=iters, n_pools=n_pools, regime='without dropouts')
>>> df
Peptide            Address        Epitope   Act Pools         # of pools  # of epitopes  # of peptides  Remained  # of lost  Right peptide  Right epitope
MFVFLVLLPLVSSQCVN  [0, 1, 2, 3]   MFVFLVLL  [0, 1, 2, 3]      4           5              1              –         0          True           True
MFVFLVLLPLVSSQCVN  [0, 1, 2, 3]   MFVFLVLL  [0, 1, 2, 3]      4           5              1              –         0          True           True
…
MFVFLVLLPLVSSQCVN  [0, 1, 2, 3]   VLLPLVSS  [0, 1, 2, 3, 10]  5           5              2              –         0          True           True
…
VLLPLVSSQCVNLTTRT  [0, 1, 2, 10]  VLLPLVSS  [0, 1, 2, 3, 10]  5           5              2              –         0          True           True
…
Peptide — peptide sequence
Address — pool indices where this peptide should be added
Epitope — checked epitope from this peptide
Act pools — list with pool indices where this epitope is present
# of pools — number of pools where this epitope is present
# of epitopes — number of epitopes that are present in the same pools (= number of possible epitopes upon activation of such pools)
# of peptides — number of peptides in which there are epitopes that are present in the same pools (= number of possible peptides upon activation of such pools)
Remained — only upon regime=”with dropouts”, list of pools remained after mistake
# of lost — only upon regime=”with dropouts”, number of dropped pools due to mistake
Right peptide — True or False, whether the peptide is present in the list of possible peptides
Right epitope — True or False, whether the epitope is present in the list of possible epitopes
To interpret the results of the experiment, you need to find all rows where the “Act Pools” column contains your combination of activated pools. Then, you will know all possible peptides and epitopes that could lead to the activation of such a combination of pools.
If you can not find your combination of activated pools in the table, here is the sequence of actions.
After the experiment, you will know the number of activated pools. This number depends on the length of overlap and the length of the expected epitope. You can check the distribution of epitope presence in your peptides using the cpp.how_many_peptides() function. The number of activated pools equals the peptide occurrence plus one per additional peptide sharing this epitope. This way, if the epitope is present in only 1 peptide (usually, this is the case for epitopes at the ends of the protein), then the number of activated pools is equal to the peptide occurrence. If the epitope is present in two peptides, then the number of activated pools is equal to the peptide occurrence + 1.
If overlap length is consistent across all peptides, then the number of activated pools would be the same for almost all epitopes (except for the epitopes at the ends of the protein). Although even if the overlap is inconsistent, you can use the analysis, but it will hinder the interpretation of the results in some cases.
If a shift length between two peptides is equal to or less than the expected epitope length divided by two, then the number of activated pools should be equal to the peptide occurrence value + 1.
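The counting rule above can be illustrated with a short sketch: the pools expected to be activated by an epitope are the union of the addresses of all peptides containing it. The helper activated_pools is hypothetical, and the two addresses are the ones shown in the example tables of this document.

```python
def activated_pools(epitope, peptide_address):
    """Union of the addresses of all peptides that contain the epitope."""
    pools = set()
    for peptide, address in peptide_address.items():
        if epitope in peptide:
            pools.update(address)
    return sorted(pools)


peptide_address = {
    'MFVFLVLLPLVSSQCVN': [0, 1, 2, 3],
    'VLLPLVSSQCVNLTTRT': [0, 1, 2, 10],
}
# Epitope present in one peptide: occurrence (4) pools are activated.
print(activated_pools('MFVFLVLL', peptide_address))
# Epitope shared by two neighbouring peptides: occurrence + 1 pools.
print(activated_pools('VLLPLVSS', peptide_address))
```

Because neighbouring addresses differ in one pool, each additional peptide sharing the epitope adds exactly one pool to the union.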
If the number of activated pools is lower than expected under the rule described above, then three options are possible:
The target peptide is the peptide at the end of your peptide list, and the target epitope is not located in an overlap of this peptide with the next one. This could be checked easily: if your activated pools are not the same as the activated pools for any epitope from the first or last peptide, then you should check the second option.
For the target peptide, overlap with its neighbor is shorter than usual, and therefore the target epitope is not shared by the usual number of peptides. You can check that using cpp.all_overlaps() or cpp.how_many_peptides(). Nevertheless, given the absence of drop-outs, you should still be able to find the target peptide in the table with simulation results by searching for all rows where the “Act Pools” column contains your combination of activated pools.
Some pools were not activated, although they should have been; then, we recommend using the “with drop-outs” regime of the simulation. It imitates drop-outs of all possible pools, so you should be able to find your case in the resulting table.
If the number of activated pools is higher than expected under the rule described above, then two options are possible:
For the target peptide, overlap with its neighbor is longer than usual, and therefore the target epitope is shared between more peptides. You can check that using cpp.all_overlaps() or cpp.how_many_peptides(). Nevertheless, given the absence of drop-outs, you should still be able to find the target peptide in the table with simulation results by searching for all rows where the “Act Pools” column contains your combination of activated pools.
Some pools were activated, although they should not have been. This issue is not addressed in the package.
>>> df = cpp.run_experiment(lst=lst, peptide_address=peptide_address, ep_length=8, pools=pools, iters=iters, n_pools=n_pools, regime='with dropouts')
>>> df
Peptide            Address         Epitope   Act Pools          # of pools  # of epitopes  # of peptides  Remained           # of lost  Right peptide  Right epitope
MFVFLVLLPLVSSQCVN  [0, 1, 2, 3]    MFVFLVLL  [0, 1, 2, 3]       4           40             12             [0, 1, 2]          1          True           False
MFVFLVLLPLVSSQCVN  [0, 1, 2, 3]    MFVFLVLL  [0, 1, 2, 3]       4           76             25             [0, 1, 3]          1          True           False
…
RTQLPPAYTNSFTRGVY  [8, 9, 10, 11]  RTQLPPAY  [0, 8, 9, 10, 11]  5           5              2              [0, 8, 9, 10, 11]  0          True           True
…
RTQLPPAYTNSFTRGVY  [8, 9, 10, 11]  TQLPPAYT  [0, 8, 9, 10, 11]  5           190            53             [8, 9]             3          True           True
…
Peptide — peptide sequence
Address — pool indices where this peptide should be added
Epitope — checked epitope from this peptide
Act pools — list with pool indices where this epitope is present
# of pools — number of pools where this epitope is present
# of epitopes — number of epitopes that are present in the same pools (= number of possible epitopes upon activation of such pools)
# of peptides — number of peptides in which there are epitopes that are present in the same pools (= number of possible peptides upon activation of such pools)
Remained — only upon regime=”with dropouts”, list of pools remained after mistake
# of lost — only upon regime=”with dropouts”, number of dropped pools due to mistake
Right peptide — True or False, whether the peptide is present in the list of possible peptides
Right epitope — True or False, whether the epitope is present in the list of possible epitopes
Right peptide and Right epitope columns are needed to check the algorithm of dropped pool recovery. Either “Right peptide” or “Right epitope” should contain the value “True”; otherwise, recovery was unsuccessful.
Also, the regime “with drop-outs” can not differentiate between dropped pools due to a mistake and absent pools due to experiment design. This way, for epitopes located at the end of proteins, the algorithm would think that pools were dropped and would try to recover them. Because of that, if you suspect the epitope located at the end of the peptide to be the target epitope, we recommend first using the “without drop-outs” regime. You can look at the sequence of actions described above. The same applies to peptides with longer overlap. So, we strongly recommend using peptides with consistent overlap length.
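To illustrate what imitating drop-outs amounts to, the sketch below enumerates the pools that remain when some of the expected pools fail to activate. It is only an illustration: dropout_patterns and the max_lost bound are hypothetical, and how the package limits or orders the drop-outs is not described here.

```python
from itertools import combinations


def dropout_patterns(act_pools, max_lost):
    """List (remained pools, number of lost pools) for every possible drop-out."""
    patterns = []
    for n_lost in range(1, max_lost + 1):
        for lost in combinations(act_pools, n_lost):
            remained = [pool for pool in act_pools if pool not in lost]
            patterns.append((remained, n_lost))
    return patterns


# With one lost pool out of four there are four possible remainders.
for remained, n_lost in dropout_patterns([0, 1, 2, 3], max_lost=1):
    print(remained, n_lost)
```

The number of patterns grows quickly with the number of lost pools, which is consistent with the longer run time of the “with drop-outs” regime.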
(Optional) To avoid mixing pools manually, you can print special punch cards using files with their 3D models produced by this step.
One punch card is needed for each pool. Each punch card is a thin card with holes located at the spots where the needed peptides are located in the plate. Therefore, each punch card has the number of holes equal to the number of peptides in a pool. Then, this card should be placed on an empty tip box, and a tip should be inserted into each hole. This way, if you are using a multichannel pipette, all tips are already arranged to take only the required peptides.
To generate the files with 3D models, you need two functions.
Note
The rendering of 3D models is a long process, so it could take time.
- cpp.pools_stl(peptides_table, pools, rows=16, cols=24, length=122.10, width=79.97, thickness=1.5, hole_radius=4.0 / 2, x_offset=9.05, y_offset=6.20, well_spacing=4.5) -> dictionary
- Parameters:
peptides_table (pandas DataFrame) – table representing the arrangement of peptides in a plate, is not produced by any function in the package
pools (pandas DataFrame) – table with a pooling scheme, where one row represents each pool, pool index is the index column, and a string with all peptides added to this pool separated by “;” is “Peptides” column.
rows (int) – number of rows in your plate with peptides
cols (int) – number of columns in your plate with peptides
length (float) – length of the plate in mm
width (float) – width of the plate in mm
thickness (float) – desired thickness of the punch card, in mm
hole_radius (float) – the radius of the holes, in mm, should be adjusted to fit your tip
x_offset (float) – the margin along the X axis for the A1 hole, in mm
y_offset (float) – the margin along the Y axis for the A1 hole, in mm
well_spacing (float) – the distance between wells, in mm
- Returns:
dictionary with Mesh objects, where key is pool index, and value is a Mesh object of a corresponding punch card.
- Return type:
dictionary
>>> meshes_list = cpp.pools_stl(peptides_table, pools, rows = 16, cols = 24, length = 122.10, width = 79.97, thickness = 1.5, hole_radius = 2.0, x_offset = 9.05, y_offset = 6.20, well_spacing = 4.5)
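The geometric parameters can be related to hole positions with a small calculation. This sketch assumes zero-based row and column indices, with columns running along the X axis (plate length) and rows along the Y axis (plate width); that orientation is an assumption, and hole_center is a hypothetical helper rather than a package function.

```python
def hole_center(row, col, x_offset=9.05, y_offset=6.20, well_spacing=4.5):
    """(x, y) centre of a hole in mm; row=0, col=0 is taken to be well A1."""
    return (x_offset + col * well_spacing, y_offset + row * well_spacing)


# First and last wells of a 16 x 24 plate with the default parameters.
print(hole_center(0, 0))
print(hole_center(15, 23))
```

With the default values, the last hole stays within the 122.10 mm by 79.97 mm card, which is a quick sanity check when adapting the parameters to another plate.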
Now, you need to pass generated dictionary to the function exporting it as a .zip file.
- cpp.zip_meshes_export(meshes_list) -> None
- Parameters:
meshes_list (dictionary) – dictionary with Mesh objects, generated in previous step
- Returns:
export Mesh objects as STL files in .zip archive.
- Return type:
None
>>> cpp.zip_meshes_export(meshes_list)
=> You will get a .zip archive with generated STL files. Then, you can send these STL files directly to a 3D printer. We recommend writing the index of the pool on the punch card. Also, you can check the generated STL files using OpenSCAD.
To interpret the results, you can use the Bayesian mixture model of activation signal.
Plate notation for the model (for 12 pools and 3 replicas).
- cpp.activation_model(obs, n_pools, inds) -> fig, pandas DataFrame
Note
Fitting might take several minutes.
- Parameters:
obs (list) – list with observed values
n_pools (int) – number of pools
inds (list) – list with pool indices for the observed values
- Returns:
fig – posterior predictive KDE and observed data KDE
probs – probability for each pool of being drawn from the distribution of activated or non-activated pools
- Return type:
figure, pandas DataFrame
>>> fig, probs = cpp.activation_model(obs, 12, inds)
>>> probs
Pool  assign
0     0.99900
1     1.00000
2     0.00025
3     0.36475
4     0.00025
5     0.00000
6     1.00000
7     1.00000
8     0.99975
9     0.99975
10    0.00000
11    0.99975
The Pool column contains the pool index, and the assign column contains the probability that the pool was drawn from the distribution of non-activated pools. A pool is considered activated if assign <= 0.5.
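Applying the threshold by hand is straightforward. The sketch below copies the assign values from the example table into a plain dictionary (in practice they would come from the probs DataFrame) and lists the pools that count as activated.

```python
# assign values copied from the example table: pool index -> probability of
# belonging to the distribution of non-activated pools.
assign = {
    0: 0.99900, 1: 1.00000, 2: 0.00025, 3: 0.36475,
    4: 0.00025, 5: 0.00000, 6: 1.00000, 7: 1.00000,
    8: 0.99975, 9: 0.99975, 10: 0.00000, 11: 0.99975,
}

# A pool is considered activated if assign <= 0.5.
activated = sorted(pool for pool, p in assign.items() if p <= 0.5)
print(activated)
```

Pool 3 is the least certain call in this example, since its value of 0.36475 is the closest to the threshold.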
Using this table, you can assess which pools were activated and which were not, and then look up the result in the check_results table produced by the simulation. Alternatively, you can use the following functions:
- cpp.peptide_probabilities(sim, probs) -> pandas DataFrame
- Parameters:
sim (pandas DataFrame) – check_results table with simulation with or without drop-outs
probs (pandas DataFrame) – DataFrame with probabilities produced by cpp.activation_model()
- Returns:
peptide_probs – probability for each peptide to cause such a pattern of activation
- Return type:
pandas DataFrame
>>> peptide_probs = cpp.peptide_probabilities(sim, probs)
>>> peptide_probs
Peptide            Address        Act Pools           Probability   Activated  Non-Activated
MFVFLVLLPLVSSQCVN  [0, 1, 2, 3]   [0, 1, 2, 3]        1.172135e-07  2          2
MFVFLVLLPLVSSQCVN  [0, 1, 2, 3]   [0, 1, 2, 3, 7]     8.262788e-10  2          2
VLLPLVSSQCVNLTTRT  [1, 2, 3, 7]   [0, 1, 2, 3, 7]     8.262788e-10  2          2
VLLPLVSSQCVNLTTRT  [1, 2, 3, 7]   [1, 2, 3, 7, 11]    2.119434e-05  3          3
VSSQCVNLTTRTQLPPA  [2, 3, 7, 11]  [1, 2, 3, 7, 11]    2.119434e-05  3          3
…                  …              …                   …             …          …
FDEDDSEPVLKGVKLHY  [0, 1, 3, 5]   [0, 1, 2, 3, 4, 5]  3.259596e-08  3          3
FDEDDSEPVLKGVKLHY  [0, 1, 3, 5]   [0, 1, 2, 3, 5]     2.104844e-06  3          2
DEDDSEPVLKGVKLHYT  [0, 1, 2, 5]   [0, 1, 2, 3, 4, 5]  3.259596e-08  3          3
DEDDSEPVLKGVKLHYT  [0, 1, 2, 5]   [0, 1, 2, 3, 5]     2.104844e-06  3          2
DEDDSEPVLKGVKLHYT  [0, 1, 2, 5]   [0, 1, 2, 5]        7.922877e-09  2          2
And then this table can be used to find cognate peptides:
- cpp.results_analysis(peptide_probs, probs, sim) -> list, list, list
- Parameters:
peptide_probs (pandas DataFrame) – DataFrame with probabilities for each peptide produced by cpp.peptide_probabilities()
probs (pandas DataFrame) – DataFrame with probabilities produced by cpp.activation_model()
sim (pandas DataFrame) – check_results table with simulation with or without drop-outs
- Returns:
note about detected drop-outs (erroneously non-activated pools);
list of the most possible peptides;
list of all possible peptides given this pattern of pools activation.
- Return type:
list, list, list
>>> note, most, possible = cpp.results_analysis(peptide_probs, probs, sim)
>>> note
'No drop-outs were detected'
>>> most
['SSANNCTFEYVSQPFLM', 'CTFEYVSQPFLMDLEGK']
>>> possible
['SSANNCTFEYVSQPFLM', 'CTFEYVSQPFLMDLEGK']
Play with the approach using simulated data (Optional)
If you want to play with the approach with the generated data, you can use the following pipeline.
First, you need to determine the parameters for pooling scheme.
how many peptides? (len_lst)
how many pools? (n_pools)
what is peptide occurrence, i.e. to how many pools one peptide would be added? (iters)
what would be the length of the peptide? (pep_length)
what is the length of the shift between two overlapping peptides? (shift)
what is the length of the expected epitope (ep_length, we recommend 8)
Then, you can use these parameters to generate peptides. First, you would need to generate a random sequence, and then you could generate peptides using sliding window approach.
>>> len_lst = 100
>>> n_pools = 12
>>> iters = 4
>>> pep_length = 17
>>> shift = 5
>>> ep_length = 8
>>> sequence = cpp.random_amino_acid_sequence(shift*len_lst + (100-shift*len_lst%100))
>>> sequence
'EMKFLDQSQLGYVHPKWHHGTEMDEWSRSNSAYGKHQEATRLCSQWWVKTYMPTDPCWMLRYTNCCAMVPRYADFCMRDYRYAYIYFVNWNHECSDVIMETCCFALGKKLSTPTCTPGCVTVIYECKSEFEVGWPPHIIEGSAEFYAVACFVTRFMCPQTKANLLKIIISFHLHHYGQAEQICYKNEIPCCAMKFFDHREGLESNCLTCMQWPCNKSLFDPFPVMYRFSMAGNQGEPPCGYAVTMNARCTMGRWQKFRCEFKGCFYHNINVYTGCETMHECQIPVPMVHQTTLLYPCNVRSKDIDPCDWSYLEDDKERGWCGKFQMGSQIFRKFTPPPWTNRGWNHMDDTEARHRWCLTWKFTLDEPAEDTCILWIHSVYLWVVCMQGTAMSMRMVSFTLLCFMRAPPCEVMHYCDPQQTRDEELPMVGYITEELKSMFTSSSWPGSQSPGWGTWDLSIKRHSVKVPDMINPTHVVKPTKCICNQSLGWTFSEIDMYARHDIQKRWKCPIWNGQFRYEVIHSKQNPFQNSDEQPT'
>>> # Then with this sequence you can generate peptides (sliding window with step `shift`)
>>> lst_all = []
>>> for i in range(0, len(sequence), shift):
...     ps = sequence[i:i+pep_length]
...     if len(ps) == pep_length:
...         lst_all.append(ps)
>>> lst = lst_all[:len_lst]
Then you can finally generate the pooling scheme.
>>> b, lines = cpp.address_rearrangement_AU(n_pools=n_pools, iters=iters, len_lst=len_lst)
>>> pools, peptide_address = cpp.pooling(lst=lst, addresses=lines, n_pools=n_pools)
>>> check_results = cpp.run_experiment(lst=lst, peptide_address=peptide_address, ep_length=ep_length, pools=pools, iters=iters, n_pools=n_pools, regime='without dropouts')
Then you need to select a cognate epitope to later check whether the model can recover it. You can do it manually if you particularly like some of them. But also you can do that randomly.
>>> cognate = check_results.sample(1)['Epitope'].iloc[0]  # positional access; the sampled row rarely has label 0
>>> check_results['Cognate'] = False
>>> check_results.loc[check_results['Epitope'] == cognate, 'Cognate'] = True
>>> print(list(set(check_results['Peptide'][check_results['Epitope'] == cognate])))
['YCNQNWDWDMCEVVCGR', 'WDWDMCEVVCGRDFCHC']
Also, you would need to find the pools which would be activated given this epitope is cognate.
>>> inds_p_check = check_results[check_results['Cognate'] == True]['Act Pools'].values[0]
>>> inds_p_check = [int(x) for x in inds_p_check[1:-1].split(', ')]
>>> inds_n_check = []
>>> for item in range(n_pools):
...     if item not in inds_p_check:
...         inds_n_check.append(item)
>>> inds_p_check
[5, 6, 9, 10, 11]
>>> inds_n_check
[0, 1, 2, 3, 4, 7, 8]
Then you can simulate the activation signal. For that, you need to set the parameters of the model.
mu_n - mu of the negative distribution (distribution of signal of non-activated pools)
sigma_n - sigma of the negative distribution
mu_off - mu of the offset which will be used to obtain positive distribution (distribution of signal of activated pools) from the negative distribution
sigma_off - sigma of the offset which will be used to obtain positive distribution
r - number of replicas in the experiment
p_shape - number of activated pools in simulation, you can make it equal to the number of pools where cognate epitope is present, or you can make more / less to see how the algorithm responds to mistakes.
>>> import pandas as pd
>>> mu_off = 10
>>> sigma_off = 0.01
>>> mu_n = 5
>>> sigma_n = 1
>>> r = 1
>>> p_shape = len(inds_p_check)
>>> n_shape = n_pools-p_shape
>>> p_results, n_results = cpp.simulation(mu_off, sigma_off, mu_n, sigma_n, n_pools, r, iters, p_shape)
>>> cells = pd.DataFrame(columns = ['Pool', 'Percentage'])
>>> cells['Percentage'] = p_results + n_results
>>> cells['Pool'] = inds_p_check*r + inds_n_check*r
Cells is a DataFrame with the simulated data:
>>> cells
Pool  Percentage
5     14.554757
6     14.818329
9     14.846125
10    14.536968
11    15.311202
0     4.544784
1     4.422958
2     4.514103
3     4.458392
4     4.575509
7     5.791510
8     5.334201
Then you can use this table to check the algorithm.
>>> inds = list(cells['Pool'])
>>> obs = list(cells['Percentage'])
>>> fig, probs = cpp.activation_model(obs, n_pools, inds)
>>> peptide_probs = cpp.peptide_probabilities(check_results, probs)
>>> cpp.results_analysis(peptide_probs, probs, check_results)
('No drop-outs were detected', ['YCNQNWDWDMCEVVCGR', 'WDWDMCEVVCGRDFCHC'], ['YCNQNWDWDMCEVVCGR', 'WDWDMCEVVCGRDFCHC'])
Now you can compare recovered cognate peptides with ones you chose:
[‘YCNQNWDWDMCEVVCGR’, ‘WDWDMCEVVCGRDFCHC’] - you chose
[‘YCNQNWDWDMCEVVCGR’, ‘WDWDMCEVVCGRDFCHC’] - were recovered by the model from the simulated activation data
You can play with different parameters to check how well the approach works. For example, you can decrease the offset for the positive distribution to check how different the signals of activated and non-activated pools must be to yield correct results.
Peptide occurrence search
- cpp.factorial(num) -> int
- Parameters:
num (int) – number
- Returns:
factorial of num
- Return type:
int
>>> cpp.factorial(10)
3628800
- cpp.combination(n, k) -> int
- Parameters:
n (int) – set length
k (int) – how many items are selected from the set
- Returns:
number of combinations of k items from a set of length n
- Return type:
int
>>> cpp.combination(10, 3)
120
- cpp.find_possible_k_values(n, l) -> list
- Parameters:
n (int) – number of pools
l (int) – number of peptides
- Returns:
list with possible peptide occurrences given number of pools and number of peptides.
- Return type:
list
>>> cpp.find_possible_k_values(12, 250)
[4, 5, 6, 7, 8]
Address arrangement
Note
Method for n-bit balanced Gray code construction is based on the textbook Counting sequences, Gray codes and lexicodes. Method for construction of balanced Gray code with flexible length is based on the paper Balanced Gray Codes With Flexible Lengths.
- cpp.find_q_r(n) -> tuple
- Parameters:
n (int) – number
- Returns:
solution (q, r) of the equation 2**n = n*q + r
- Return type:
(int, int)
>>> cpp.find_q_r(5)
(6, 2)
- cpp.bgc(n, s=None) -> list
Note
Works only for n=4 and n=5.
- Parameters:
n (int) – number of bits
s (list) – transition sequence for n-2 bit balanced Gray code
- Returns:
transition sequence for n bit balanced Gray code
- Return type:
list
>>> cpp.bgc(4, s = None)
[1, 2, 1, 3, 4, 3, 1, 2, 3, 2, 4, 2, 1, 4, 3, 4]
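A transition sequence lists which bit flips at each step. The sketch below walks the 4-bit sequence from the example and checks that it visits every code word once, returns to the start, and flips every bit equally often. It assumes the values are 1-based bit positions, as the example suggests; walk_transitions is a hypothetical helper, not a package function.

```python
from collections import Counter


def walk_transitions(transitions, n_bits):
    """Apply a transition sequence to the all-zero word; return visited words."""
    word = [0] * n_bits
    words = [tuple(word)]
    for t in transitions:
        word[t - 1] ^= 1  # flip the bit named by the transition (1-based)
        words.append(tuple(word))
    return words


seq = [1, 2, 1, 3, 4, 3, 1, 2, 3, 2, 4, 2, 1, 4, 3, 4]
words = walk_transitions(seq, 4)
print(len(set(words[:-1])), words[-1] == words[0])
print(Counter(seq))
```

Every bit is flipped four times, which is what makes this Gray code balanced.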
- cpp.n_bgc(n): -> list
- Parameters:
n (int) – number of bits
- Returns:
transition sequence for n bit balanced Gray code
- Return type:
list
>>> cpp.n_bgc(6)
[1, 2, 1, 3, 4, 3, 1, 2, 3, 2, 4, 2, 1, 4, 3, 5, 3, 4, 1, 2, 4, 6, 4, 2, 1, 4, 3, 5, 3, 4, 1, 2, 4, 2, 5, 6, 3, 6, 5, 2, 5, 6, 1, 6, 5, 3, 5, 6, 4, 6, 5, 3, 5, 6, 1, 6, 5, 2, 5, 6, 1, 6, 5, 6]
- cpp.computing_ab_i_odd(s_2, l, v): -> list
Note
Intrinsic function for cpp.m_length_BGC(), can not be used globally.
- Parameters:
s_2 (list) – transition sequence for balanced Gray code with n bits
l (int) – number, corresponds to _l_ from the method described by Lu Wang et al., 2016
v (int) – number, corresponds to _v_ from the method described by Lu Wang et al., 2016
- Returns:
[v, a_values, E_v]
- Return type:
list
- cpp.m_length_BGC(m, n): -> list
- Parameters:
m (int) – required length of the code
n (int) – number of bits
- Returns:
transition sequence for n bit balanced Gray code of length m
- Return type:
list
>>> cpp.m_length_BGC(m=28, n=5)
[0, 1, 2, 3, 2, 1, 0, 4, 0, 1, 2, 3, 2, 1, 0, 1, 3, 4, 2, 4, 3, 1, 3, 4, 0, 4, 3, 4]
- cpp.gc_to_address(s_2, iters, n): -> list
Tip
We do not recommend using this function for address arrangement, since the result might be imbalanced and have other features that hinder the interpretation of the experiment.
- Parameters:
s_2 (list) – transition sequence for Gray code
iters (int) – peptide occurrence
n (int) – number of pools
- Returns:
address arrangement based on the produced Gray code
- Return type:
list
>>> cpp.gc_to_address(cpp.m_length_BGC(m=28, n=5), 2, 5)
[[0, 4], [2, 4], [2, 3], [3, 4], [0, 3], [0, 2], [1, 3], [1, 2], [1, 4]]
- cpp.union_address(address, union): -> list
- Parameters:
address (string) – address in bit view
union (string) – union in bit view
- Returns:
unions possible after given union and address
- Return type:
list
>>> cpp.union_address('110000', '111000')
['110100', '110010', '110001']
- cpp.address_union(address, union): -> list
- Parameters:
address (string) – address in bit format
union (string) – union in bit format
- Returns:
addresses possible after given address and union
- Return type:
list
>>> cpp.address_union('011000', '111000')
['110000', '101000']
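The two examples above are consistent with a simple reading: a union is an address with one extra pool, and an address is a union with one pool removed. The sketch below implements that reading with hypothetical helpers next_unions and next_addresses; it is inferred from the example outputs, the package may order its results differently, and its exact rules are not stated here.

```python
def next_unions(address, union):
    """Unions reachable from `address` by adding one pool, excluding `union`."""
    out = []
    for i, bit in enumerate(address):
        if bit == '0':
            candidate = address[:i] + '1' + address[i + 1:]
            if candidate != union:
                out.append(candidate)
    return out


def next_addresses(address, union):
    """Addresses reachable from `union` by removing one pool, excluding `address`."""
    out = []
    for i, bit in enumerate(union):
        if bit == '1':
            candidate = union[:i] + '0' + union[i + 1:]
            if candidate != address:
                out.append(candidate)
    return out


print(sorted(next_unions('110000', '111000')))
print(sorted(next_addresses('011000', '111000')))
```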
- cpp.hamiltonian_path_AU(size, point, t, unions, path=None): -> list
Note
This function is recursive. It is an intrinsic function for cpp.address_rearrangement_AU(), though it can work globally.
- Parameters:
size (int) – length of the required path
point (string) – union or address that is added currently at this step
t ('a' or 'u') – type of added point (union or address)
unions (list) – unions used in the path
path (list) – addresses used in the path
- Returns:
arrangement of addresses in bit format
- Return type:
list
>>> cpp.hamiltonian_path_AU(size=10, point = '110000', t = 'a', unions = ['111000'])
['110000', '100100', '000110', '000011', '001001', '010001', '010010', '011000', '001100', '101000']
- cpp.variance_score(bit_sums, s): -> float
- Parameters:
bit_sums (list) – current distribution of peptides across pools
s (string) – union or address that is added currently at this step
- Returns:
penalty for balance distortion upon this point addition to the path
- Return type:
float
>>> cpp.variance_score([2, 4, 4, 3, 3, 4], '110001')
0.25
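The example value can be reproduced as the increase in population variance of the pool loads after the point is added. This is an observation about the example only, and the package's actual formula is not given here, so the sketch below (with the hypothetical helpers population_variance and variance_increase) should be read as one possible interpretation.

```python
def population_variance(values):
    """Population variance (divides by the number of values)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def variance_increase(bit_sums, point):
    """Change in variance of pool loads when `point` (bit string) is added."""
    updated = [count + int(bit) for count, bit in zip(bit_sums, point)]
    return population_variance(updated) - population_variance(bit_sums)


print(round(variance_increase([2, 4, 4, 3, 3, 4], '110001'), 6))
```

A point that adds peptides to already crowded pools raises the variance, while a point that fills the emptier pools lowers it.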
- cpp.return_address_message(code, mode): -> string or list
- Parameters:
code (list or string) – address (for example, [0, 1, 2]) or address in bit format (for example, ‘111000’)
mode ('a' or 'mN') – desired output: 'a' converts a bit-format string to an address, and 'mN' converts an address to bit format, where N is the number of pools
- Returns:
corresponding address in bit format (‘111000’) or address ([0, 1, 2])
- Return type:
string or list
>>> cpp.return_address_message([1, 2, 4], 'm7')
'0110100'
>>> cpp.return_address_message('0111100', 'a')
[1, 2, 3, 4]
- cpp.binary_union(bin_list): -> list
- Parameters:
bin_list (list) – list of addresses
- Returns:
list of their unions
- Return type:
list
>>> cpp.binary_union(['110000', '100001', '000101', '000110', '001010', '010010', '010100', '100100', '101000', '001001'])
['110001', '100101', '000111', '001110', '011010', '010110', '110100', '101100', '101001']
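In the example above, each union is the bitwise OR of two consecutive addresses, so ten addresses give nine unions. A minimal sketch under that reading (`binary_union_sketch` is a hypothetical name, not the package function):

```python
def binary_union_sketch(bin_list):
    """Bitwise OR of every pair of consecutive addresses in bit format."""
    unions = []
    for left, right in zip(bin_list, bin_list[1:]):
        # A pool belongs to the union if it is set in either address.
        unions.append(''.join('1' if '1' in (a, b) else '0' for a, b in zip(left, right)))
    return unions


addresses = ['110000', '100001', '000101', '000110', '001010',
             '010010', '010100', '100100', '101000', '001001']
print(binary_union_sketch(addresses))
```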
- cpp.hamming_distance(s1, s2): -> int
- Parameters:
s1 (string) – address in bit format
s2 (string) – address in bit format
- Returns:
hamming distance between two addresses
- Return type:
int
>>> cpp.hamming_distance('110000', '100001')
2
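The Hamming distance counts the positions at which two equal-length bit strings differ. A minimal sketch (`hamming_distance_sketch` is a hypothetical name; the length check is an added assumption):

```python
def hamming_distance_sketch(s1, s2):
    """Number of positions where two equal-length strings differ."""
    if len(s1) != len(s2):
        raise ValueError("addresses must have the same length")
    return sum(a != b for a, b in zip(s1, s2))


print(hamming_distance_sketch('110000', '100001'))
```

Two addresses with the same number of pools that differ by one swapped pool are at distance 2, as in the example.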
- cpp.sum_bits(arr): -> list
- Parameters:
arr (list) – current address arrangement in bit format
- Returns:
peptide distribution across pools given this arrangement
- Return type:
list
>>> cpp.sum_bits(['110001', '100101', '000111', '001110', '011010', '010110', '110100', '101100', '101001'])
[5, 4, 4, 6, 4, 4]
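The distribution across pools is a column-wise sum over the bit strings: position i counts how many entries of the arrangement include pool i. A minimal sketch (`sum_bits_sketch` is a hypothetical name):

```python
def sum_bits_sketch(arr):
    """Column-wise sum of bit strings of equal length."""
    # zip(*arr) groups the bits of all strings by pool position.
    return [sum(int(bit) for bit in column) for column in zip(*arr)]


arrangement = ['110001', '100101', '000111', '001110', '011010',
               '010110', '110100', '101100', '101001']
print(sum_bits_sketch(arrangement))
```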
- cpp.hamiltonian_path_A(G, size, pt, path=None): -> list
Note
This function is recursive. It is an internal function for cpp.address_rearrangement_A(), though it can also be called directly.
- Parameters:
G – graph representing peptide space
size (int) – length of the required path
pt (string) – union or address that is added currently at this step
path (list) – addresses used in the path
- Returns:
arrangement of addresses in bit format
- Return type:
list
>>> cpp.hamiltonian_path_A(G = G, size = 10, pt = '11000', path=None)
['11000', '01100', '00101', '00011', '10010', '00110', '01010', '01001', '10001', '10100']
- cpp.address_rearrangement_AU(n_pools, iters, len_lst): -> list, list
Note
Searching for an arrangement may take some time, especially with large parameters. However, this function is faster than cpp.address_rearrangement_A(), since it considers both vertices and edges as it traverses the graph.
- Parameters:
n_pools (int) – number of pools
iters (int) – peptide occurrence
len_lst (int) – number of peptides
- Returns:
list with number of peptides in each pool;
list with address arrangement, uses both unions and addresses for its construction
- Return type:
list, list
>>> b, lines = cpp.address_rearrangement_AU(n_pools=12, iters=4, len_lst=250)
>>> b
[81, 85, 85, 85, 81, 82, 87, 81, 85, 81, 84, 83]
>>> lines
[[0, 1, 2, 3], [0, 1, 3, 6], [0, 1, 6, 8], [1, 6, 8, 9], [6, 8, 9, 11], ... ]
- cpp.address_rearrangement_A(n_pools, iters, len_lst): -> list, list
Note
Searching for an arrangement may take some time, especially with large parameters. This function is slower than cpp.address_rearrangement_AU(), since it considers only vertices as it traverses the graph.
- Parameters:
n_pools (int) – number of pools
iters (int) – peptide occurrence
len_lst (int) – number of peptides
- Returns:
list with number of peptides in each pool;
list with address arrangement, uses only addresses for its construction
- Return type:
list, list
>>> b, lines = cpp.address_rearrangement_A(n_pools=12, iters=4, len_lst=250)
>>> b
[82, 83, 85, 85, 83, 83, 84, 81, 83, 83, 84, 84]
>>> lines
[[0, 1, 2, 3], [0, 2, 3, 7], [0, 3, 7, 11], [0, 7, 10, 11], [7, 8, 10, 11], ... ]
Peptide overlap
- cpp.string_overlap(str1, str2): -> int
- Parameters:
str1 (string) – peptide
str2 (string) – peptide
- Returns:
overlap length between two peptides
- Return type:
int
>>> cpp.string_overlap('ASDFGHJKTYUIO', 'GHJKTYUIOTYUI')
9
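For ordered overlapping peptides, the overlap can be read as the longest suffix of the first peptide that is also a prefix of the second. The sketch below follows that reading, which is an assumption about the package's definition; `string_overlap_sketch` is a hypothetical name.

```python
def string_overlap_sketch(str1, str2):
    """Length of the longest suffix of str1 that is a prefix of str2."""
    # Try the longest possible overlap first and shrink until one matches.
    for size in range(min(len(str1), len(str2)), 0, -1):
        if str1.endswith(str2[:size]):
            return size
    return 0


print(string_overlap_sketch('ASDFGHJKTYUIO', 'GHJKTYUIOTYUI'))
```

Under this definition, the peptide pair 'FDEDDSEPVLKGVKLHY' and 'DEDDSEPVLKGVKLHYT' has an overlap of 16.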
- cpp.find_pair_with_overlap(lst, target_overlap): -> list
- Parameters:
lst (list) – ordered list of peptides
target_overlap (int) – overlap length
- Returns:
list of lists with peptides with specified overlap length.
- Return type:
list
>>> cpp.find_pair_with_overlap(lst, 16)
[['FDEDDSEPVLKGVKLHY', 'DEDDSEPVLKGVKLHYT']]
- cpp.how_many_peptides(lst, ep_length): -> Counter object, dictionary
- Parameters:
lst (list) – ordered list of peptides
ep_length (int) – expected epitope length
- Returns:
the Counter object with the number of epitopes shared across the number of peptides;
the dictionary with all possible epitopes of expected length as keys and the number of peptides where these epitopes are present as values.
- Return type:
Counter object, dictionary
>>> t, r = cpp.how_many_peptides(lst, 8)
>>> t
Counter({1: 6, 2: 1256, 3: 4})
>>> r
{'MFVFLVLL': 1, 'FVFLVLLP': 1, 'VFLVLLPL': 1, 'FLVLLPLV': 1, 'LVLLPLVS': 1, 'VLLPLVSS': 2, ...}
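The two returned objects can be illustrated on a toy peptide list. This is a sketch under the assumption that an epitope counts as present in a peptide when it occurs there as a substring; `how_many_peptides_sketch` and the toy peptides are hypothetical.

```python
from collections import Counter


def how_many_peptides_sketch(lst, ep_length):
    """Count, for every epitope of length ep_length, how many peptides contain it."""
    epitopes = {}
    for peptide in lst:
        # Slide a window of ep_length over the peptide.
        for start in range(len(peptide) - ep_length + 1):
            epitope = peptide[start:start + ep_length]
            if epitope not in epitopes:
                epitopes[epitope] = sum(epitope in other for other in lst)
    # Summary: how many epitopes are shared by 1, 2, ... peptides.
    return Counter(epitopes.values()), epitopes


counts, per_epitope = how_many_peptides_sketch(['ABCDEFGH', 'EFGHIJKL'], 4)
print(counts)
print(per_epitope['EFGH'])
```

In this toy case only 'EFGH' lies in the overlap of the two peptides, so it is the single epitope shared by two peptides.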
Pooling and simulation
- cpp.bad_address_predictor(all_ns): -> list
Tip
It was initially designed for the address arrangement produced by cpp.gc_to_address(). Keep in mind that the resulting arrangement might be imbalanced.
- Parameters:
all_ns (list) – address arrangement
- Returns:
address arrangement without addresses with the same unions. The function searches for three consecutive addresses with the same union and removes the middle one.
- Return type:
list
>>> cpp.bad_address_predictor([[0, 1, 2, 3], [0, 1, 2, 4], [0, 1, 2, 5], [0, 1, 2, 6], [0, 1, 3, 6], [0, 1, 3, 5], [0, 1, 3, 4]])
[[0, 1, 2, 3], [0, 1, 2, 4], [0, 1, 2, 5], [0, 1, 2, 6], [0, 1, 3, 6], [0, 1, 3, 5], [0, 1, 3, 4]]
- cpp.pooling(lst, addresses, n_pools): -> dictionary, dictionary
- Parameters:
lst (list) – ordered list with peptides
addresses (list) – produced address arrangement
n_pools (int) – number of pools
- Returns:
pools – dictionary with pool indices as keys and the peptides that should be added to each pool as values;
peptide address – dictionary with peptides as keys and corresponding addresses as values.
- Return type:
dictionary, dictionary
>>> pools, peptide_address = cpp.pooling(lst=lst, addresses=lines, n_pools=12)
>>> pools
{0: ['MFVFLVLLPLVSSQCVN', 'VLLPLVSSQCVNLTTRT', 'VSSQCVNLTTRTQLPPA', ...], 1: ['MFVFLVLLPLVSSQCVN', 'VLLPLVSSQCVNLTTRT', 'TQDLFLPFFSNVTWFHA', ...], ... }
>>> peptide_address
{'MFVFLVLLPLVSSQCVN': [0, 1, 2, 3], 'VLLPLVSSQCVNLTTRT': [0, 1, 2, 10], ... }
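The shape of the two returned dictionaries can be illustrated with a small sketch. It assumes that the i-th peptide of the ordered list receives the i-th address of the arrangement; `pooling_sketch` and the toy data are hypothetical and not the package's implementation.

```python
def pooling_sketch(lst, addresses, n_pools):
    """Assign each peptide to the pools listed in its address."""
    pools = {pool: [] for pool in range(n_pools)}
    peptide_address = {}
    # Assumption: peptides and addresses are paired by position.
    for peptide, address in zip(lst, addresses):
        peptide_address[peptide] = list(address)
        for pool in address:
            pools[pool].append(peptide)
    return pools, peptide_address


toy_pools, toy_address = pooling_sketch(
    ['PEP0', 'PEP1', 'PEP2'], [[0, 1], [1, 2], [0, 2]], n_pools=3)
print(toy_pools)
print(toy_address)
```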
- cpp.pools_activation(pools, epitope): -> list
- Parameters:
pools (dictionary) – pools, produced by
cpp.pooling()
epitope (string) – epitope present in one or several tested peptides
- Returns:
pool indices where the epitope is present
- Return type:
list
>>> cpp.pools_activation(pools, 'LGVYYHKN')
[0, 3, 8, 9, 11]
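A pool is expected to be activated when at least one of its peptides contains the epitope. A minimal sketch of that lookup, with a hypothetical function name and toy pools:

```python
def pools_activation_sketch(pools, epitope):
    """Indices of pools in which at least one peptide contains the epitope."""
    return [pool for pool, peptides in pools.items()
            if any(epitope in peptide for peptide in peptides)]


toy = {0: ['AAALGVYYHKN', 'QQQQQQQQ'], 1: ['QQQQQQQQ'], 2: ['LGVYYHKNWWW']}
print(pools_activation_sketch(toy, 'LGVYYHKN'))
```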
- cpp.epitope_pools_activation(peptide_address, lst, ep_length): -> dictionary
- Parameters:
peptide_address (dictionary) – peptide addresses, produced by
cpp.pooling()
lst (list) – ordered list of peptides
ep_length (int) – expected epitope length
- Returns:
activated pools for every possible epitope of expected length from entered peptides
- Return type:
dictionary
>>> cpp.epitope_pools_activation(peptide_address, lst, 8)
{'[0, 1, 2, 3]': ['MFVFLVLL', 'FVFLVLLP', 'VFLVLLPL', 'FLVLLPLV', 'LVLLPLVS'], '[0, 1, 2, 3, 9]': ['VLLPLVSS', 'LLPLVSSQ', 'LPLVSSQC', 'PLVSSQCV', 'LVSSQCVN'], '[0, 1, 3, 9, 11]': ['VSSQCVNL', 'SSQCVNLT', ...], ... }
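The structure of this dictionary can be sketched as follows: every epitope is mapped to the union of addresses of the peptides that contain it, and epitopes are grouped by that set of pools. The function name, the toy data, and the use of `str(sorted(...))` as the key are assumptions based on the example output.

```python
def epitope_pools_activation_sketch(peptide_address, lst, ep_length):
    """Group epitopes by the set of pools they would activate."""
    act_profile = {}
    seen = set()
    for peptide in lst:
        for start in range(len(peptide) - ep_length + 1):
            epitope = peptide[start:start + ep_length]
            if epitope in seen:
                continue
            seen.add(epitope)
            # Union of addresses of all peptides that contain this epitope.
            activated = set()
            for other in lst:
                if epitope in other:
                    activated.update(peptide_address[other])
            act_profile.setdefault(str(sorted(activated)), []).append(epitope)
    return act_profile


toy_address = {'ABCDEF': [0, 1], 'CDEFGH': [1, 2]}
profile = epitope_pools_activation_sketch(toy_address, ['ABCDEF', 'CDEFGH'], 4)
print(profile)
```

Epitopes in the overlap of two peptides activate the union of both addresses, which is why some keys in the example above list five pools instead of four.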
- cpp.peptide_search(lst, act_profile, act_pools, iters, n_pools, regime): -> list, list
- Parameters:
lst (list) – ordered list of peptides
act_profile (dictionary) – activated pools for every possible epitope of expected length from entered peptides, produced by
cpp.epitope_pools_activation()
act_pools (list) – activated pools
iters (int) – peptide occurrence
n_pools (int) – number of pools
regime ("with dropouts" or "without dropouts") – regime of simulation, with or without drop-outs
- Returns:
possible peptides and possible epitopes given such activated pools
- Return type:
list, list
>>> cpp.peptide_search(lst, act_profile, [0, 3, 8, 9, 11], 4, 12, 'without dropouts')
(['CNDPFLGVYYHKNNKSW', 'LGVYYHKNNKSWMESEF'], ['LGVYYHKN', 'GVYYHKNN', 'VYYHKNNK', 'YYHKNNKS', 'YHKNNKSW'])
>>> cpp.peptide_search(lst, act_profile, [0, 3, 8, 11], iters, n_pools, 'with dropouts')
(['CNDPFLGVYYHKNNKSW', 'LLKYNENGTITDAVDCA', 'LGVYYHKNNKSWMESEF', 'QPRTFLLKYNENGTITD'], ['YNENGTIT', 'LKYNENGT', 'YHKNNKSW', 'KYNENGTI', 'YYHKNNKS', 'LGVYYHKN', 'VYYHKNNK', 'NENGTITD', 'LLKYNENG', 'GVYYHKNN'])
- cpp.run_experiment(lst, peptide_address, ep_length, pools, iters, n_pools, regime): -> pandas DataFrame
Note
Simulation may take several minutes, especially in the “with dropouts” regime.
- Parameters:
lst (list) – ordered list with peptides
peptide_address (dictionary) – peptides addresses produced by pooling
ep_length (int) – expected epitope length
pools (dictionary) – pools produced by pooling
iters (int) – peptide occurrence
n_pools (int) – number of pools
regime (“with dropouts” or “without dropouts”) – regime of simulation, with or without drop-outs
- Returns:
check_results – table with the simulation results, with or without drop-outs, depending on the regime
- Return type:
pandas DataFrame
>>> df = cpp.run_experiment(lst=lst, peptide_address=peptide_address, ep_length=8, pools=pools, iters=iters, n_pools=n_pools, regime='without dropouts')
3D models
- cpp.stl_generator(rows, cols, length, width, thickness, hole_radius, x_offset, y_offset, well_spacing, coordinates): -> Mesh object
- Parameters:
rows (int) – number of rows in your plate with peptides
cols (int) – number of columns in your plate with peptides
length (float) – length of the plate in mm
width (float) – width of the plate in mm
thickness (float) – desired thickness of the punch card, in mm
hole_radius (float) – the radius of the holes, in mm, should be adjusted to fit your tip
x_offset (float) – the margin along the X axis for the A1 hole, in mm
y_offset (float) – the margin along the Y axis for the A1 hole, in mm
well_spacing (float) – the distance between wells, in mm
coordinates (list) – coordinates of holes, in tuples in list
- Returns:
punch card with holes based on the entered coordinates
- Return type:
Mesh object
>>> cpp.stl_generator(rows = 16, cols = 24, length = 122.10, width = 79.97, thickness = 1.5, hole_radius = 4.0 / 2, x_offset = 9.05, y_offset = 6.20, well_spacing = 4.5, coordinates = [(1, 1), (2, 2), (1, 2)])
Mesh object
- cpp.pools_stl(peptides_table, pools, rows=16, cols=24, length=122.10, width=79.97, thickness=1.5, hole_radius=4.0 / 2, x_offset=9.05, y_offset=6.20, well_spacing=4.5): -> dictionary
Note
Rendering of 3D models will take some time.
- Parameters:
peptides_table (pandas DataFrame) – table representing the arrangement of peptides in a plate, is not produced by any function in the package
pools (pandas DataFrame) – table with a pooling scheme, where one row represents each pool, pool index is the index column, and a string with all peptides added to this pool separated by “;” is “Peptides” column.
rows (int) – number of rows in your plate with peptides
cols (int) – number of columns in your plate with peptides
length (float) – length of the plate in mm
width (float) – width of the plate in mm
thickness (float) – desired thickness of the punch card, in mm
hole_radius (float) – the radius of the holes, in mm, should be adjusted to fit your tip
x_offset (float) – the margin along the X axis for the A1 hole, in mm
y_offset (float) – the margin along the Y axis for the A1 hole, in mm
well_spacing (float) – the distance between wells, in mm
- Returns:
dictionary with Mesh objects, where key is pool index, and value is a Mesh object of a corresponding punch card.
- Return type:
dictionary
>>> meshes_list = cpp.pools_stl(peptides_table, pools, rows = 16, cols = 24, length = 122.10, width = 79.97, thickness = 1.5, hole_radius = 2.0, x_offset = 9.05, y_offset = 6.20, well_spacing = 4.5)
You can check the generated STL files using OpenSCAD.
- cpp.zip_meshes_export(meshes_list): -> None
- Parameters:
meshes_list (dictionary) – dictionary with Mesh objects, generated by
cpp.pools_stl()
- Returns:
export Mesh objects as STL files in .zip archive.
- Return type:
None
>>> cpp.zip_meshes_export(meshes_list)
- cpp.zip_meshes(meshes_list): -> BytesIO object
- Parameters:
meshes_list (dictionary) – dictionary with Mesh objects, generated by
cpp.pools_stl()
- Returns:
zip archive with generated STL files in BytesIO format (suitable for emails)
- Return type:
BytesIO
>>> cpp.zip_meshes(meshes_list)
<_io.BytesIO at 0x1d42a1440>
Results interpretation with a Bayesian mixture model
- cpp.activation_model(obs, n_pools, inds, cores): -> fig, pandas DataFrame
Note
Fitting might take several minutes.
- Parameters:
obs (list) – list with observed values
n_pools (int) – number of pools
inds (list) – list with indices for observed values
cores (int) – number of cores
- Returns:
fig – posterior predictive KDE and observed data KDE
probs – probability for each pool of being drawn from a distribution of activated or non-activated pools
- Return type:
figure, pandas DataFrame
>>> fig, probs = cpp.activation_model(obs, 12, inds)
- cpp.peptide_probabilities(sim, probs): -> pandas DataFrame
- Parameters:
sim (pandas DataFrame) – check_results table with simulation with or without drop-outs
probs (pandas DataFrame) – DataFrame with probabilities produced by
cpp.activation_model()
- Returns:
peptide_probs – probability for each peptide to cause such a pattern of activation
- Return type:
pandas DataFrame
>>> peptide_probs = cpp.peptide_probabilities(sim, probs)
- cpp.results_analysis(peptide_probs, probs, sim): -> list, list, list
- Parameters:
peptide_probs (pandas DataFrame) – DataFrame with probabilities for each peptide produced by
cpp.peptide_probabilities()
probs (pandas DataFrame) – DataFrame with probabilities produced by
cpp.activation_model()
sim (pandas DataFrame) – check_results table with simulation with or without drop-outs
- Returns:
note about detected drop-outs (erroneously non-activated pools);
list of the most possible peptides;
list of all possible peptides given this pattern of pools activation.
- Return type:
list, list, list
>>> note, most, possible = cpp.results_analysis(peptide_probs, probs, sim)
>>> note
No drop-outs were detected
>>> most
['SSANNCTFEYVSQPFLM', 'CTFEYVSQPFLMDLEGK']
>>> possible
['SSANNCTFEYVSQPFLM', 'CTFEYVSQPFLMDLEGK']
Data simulation with Bayesian mixture model
- cpp.random_amino_acid_sequence(length): -> str
- Parameters:
length (int) – length of the random amino acid sequence from which peptides would be generated, calculate how long it should be for your number of peptides
- Returns:
generated amino acid sequence of determined length
- Return type:
str
>>> sequence = cpp.random_amino_acid_sequence(shift*len_lst + (100-shift*len_lst%100))
>>> sequence
'EMKFLDQSQLGYVHPKWHHGTEMDEWSRSNSAYGKHQEATRLCSQWWVKTYMPTDPCWMLRYTNCCAMVPRYADFCMRDYRYAYIYFVNWNHECSDVIMETCCFALGKKLSTPTCTPGCVTVIYECKSEFEVGWPPHIIEGSAEFYAVACFVTRFMCPQTKANLLKIIISFHLHHYGQAEQICYKNEIPCCAMKFFDHREGLESNCLTCMQWPCNKSLFDPFPVMYRFSMAGNQGEPPCGYAVTMNARCTMGRWQKFRCEFKGCFYHNINVYTGCETMHECQIPVPMVHQTTLLYPCNVRSKDIDPCDWSYLEDDKERGWCGKFQMGSQIFRKFTPPPWTNRGWNHMDDTEARHRWCLTWKFTLDEPAEDTCILWIHSVYLWVVCMQGTAMSMRMVSFTLLCFMRAPPCEVMHYCDPQQTRDEELPMVGYITEELKSMFTSSSWPGSQSPGWGTWDLSIKRHSVKVPDMINPTHVVKPTKCICNQSLGWTFSEIDMYARHDIQKRWKCPIWNGQFRYEVIHSKQNPFQNSDEQPT'
- cpp.simulation(mu_off, sigma_off, mu_n, sigma_n, n_pools, r, iters, p_shape, cores=1): -> list, list
Note
Generation might take several minutes.
- Parameters:
mu_off (float, from 0 to 100) – mu of the Normal distribution for the offset.
sigma_off (float, from 0 to 100) – sigma of the Normal distribution for the offset.
mu_n (float, from 0 to 100) – mu of the Truncated Normal distribution for the negative source (non-activated pools).
sigma_n (float, from 0 to 100) – sigma of the Truncated Normal distribution for the negative source.
r (int) – number of replicas for each pool
n_pools (int) – number of pools in the experiment
iters (int) – peptide occurrence in the pooling scheme, less than n_pools
p_shape (int) – number of activated pools
cores (1, int) – number of cores
- Returns:
p_results - averaged across 4 chains data for activated pools;
n_results - averaged across 4 chains data for non-activated pools.
- Return type:
list, list
>>> p_results, n_results = cpp.simulation(10, 0.01, 5, 1, 12, 1, 4, 5)
>>> p_results
[14.554757492774076, 14.818328502490942, 14.846124806885513, 14.53696797679254, 15.311202071456592]
>>> n_results
[4.544784388034261, 4.422957960260396, 4.514103073799207, 4.458391656911868, 4.575509389904373, 5.791510168841456, 5.334200680346714]